BRIEF ON APPEAL

Sir: 

Further to the Notice of Appeal filed March 17, 2003, and received by the USPTO on 
March 24, 2003, herewith are three copies of Appellants' Brief on Appeal. Authorized fees 
include the $320.00 fee for the filing of this Brief.

This is an appeal from the decision of the Examiner finally rejecting claims 1-6 of the 
above-identified application. 

(1) REAL PARTY IN INTEREST 
The above-identified application is assigned of record to Incyte Pharmaceuticals, Inc. (now
Incyte Corporation, formerly known as Incyte Genomics, Inc.) (Reel 012104, Frame 0087),
which is the real party in interest herein.

(2) RELATED APPEALS AND INTERFERENCES
Appellants, their legal representative and the assignee are not aware of any related
appeals or interferences which will directly affect or be directly affected by or have a bearing on
the Board's decision in the instant appeal.



(3) STATUS OF THE CLAIMS
Claims rejected: Claims 1-6
Claims allowed: (none)
Claims canceled: Claims 13-20
Claims withdrawn: Claims 7-12
Claims on Appeal: Claims 1-6 (A copy of the claims on appeal, as amended, can be
found in the attached Appendix).



(4) STATUS OF AMENDMENTS AFTER FINAL 
The Amendment after Final Rejection under 37 C.F.R. §1.116 filed February 12, 2003 
has been entered for purposes of this appeal. See the Advisory Action, mailed March 4, 2003, 
indicating the Amendment would be entered upon filing of an appeal. 

(5) SUMMARY OF THE INVENTION 
Appellants' invention is directed to polynucleotides encoding a human Deafness/Dystonia
peptide (DDP)-related mitochondrial import protein (TRP; SEQ ID NO: 1) based on a high level
of sequence identity (85% sequence identity) to the DDP-related mitochondrial import protein,
TIMM8b, and the conservation of a zinc-binding sequence motif characteristic of mitochondrial
import proteins. See specification, at p. 9. TIMM8b and related mitochondrial import proteins
are described in the specification and the art of record as functioning in the transport of proteins
from the cytoplasm to the mitochondrial inner membrane. See specification at p. 1 and Jin et al.
(1999), p. 259. The chromosomal localization of these proteins further indicates their likely
involvement in autosomal recessive neurodegenerative disorders, such as autosomal recessive
deafness, cerebellar ataxia, and muscular dystrophy. See specification at p. 2 and Jin et al., supra,
p. 265. Northern analysis further shows the differential expression of polynucleotides encoding
TRP in cancers of the breast, ovaries, and kidney relative to the corresponding normal tissue.
See specification at p. 9 and Table 2. The claimed polynucleotides are therefore asserted to be
useful in the diagnosis and treatment of certain cancers and neurodegenerative disorders, in
monitoring therapeutic intervention for these diseases (specification, at pp. 17 and 18), and in
toxicology testing and drug discovery (specification, at pp. 20, 21, and 22).

THE FINAL REJECTION 
Claims 1-6 stand rejected under 35 U.S.C. §§ 101 and 112, first paragraph, based on the
allegation that the claimed invention lacks patentable utility. The rejection alleges in particular 
that: 

• the claimed invention is drawn to an invention with no apparent or disclosed specific and
substantial credible utility. The instant application has provided a description of an
isolated DNA encoding a protein and the protein encoded thereby. The instant
application does not disclose the biological role of this protein or its significance. The
instant specification discloses that TRP of the instant invention is a TIMM8b-related
protein based on sequence homology to the human TIMM8b protein, and that TIMM8b is
the human homolog of DDP1 (deafness/dystonia peptide), which is associated with
Mohr-Tranebjaerg syndrome, a progressive neurodegenerative disorder leading to
deafness. The instant specification fails to provide any evidence or sound scientific
reasoning that would support a conclusion that TRP of the instant invention, which bears
85% identity to TIMM8b proteins, which in turn bare homology to the proteins encoded
by the gene associated with Mohr-Tranebjaerg syndrome, would also be associated with
Mohr-Tranebjaerg syndrome or with any or all of "various neurodegenerative and
neuromuscular diseases involving defects in oxidative phosphorylation" (Final Office
Action, filed 12/17/2002, p. 3).
• Applicants' asserted use of the claimed polynucleotides in the diagnosis of cancers of the
breast, ovaries, and kidney based on differential expression in these conditions is not
supported by the data of Table 2 of the specification because this data is not definite. It is
not clear what is the difference in degree of expression of TRP in cancer versus normal
tissue samples (Final Office Action, p. 3-4). One skilled in the art readily recognizes
(the) obviousness of the fact that during progressive proliferation of cancerous cells
numerous proteins, most of which are ubiquitous, are overexpressed. It is clear that TRP
is also expressed in the normal tissue and appear to be expressed at higher levels in
certain cancerous cells, ... the instant specification, as filed, clearly fails to provide any
guidance on how to use these markers for quantitative analysis (Advisory Action, filed
3/04/2003, p. 2).

(6) ISSUES 

1. Whether claims 1-6 directed to TRP-encoding polynucleotide sequences meet the
utility requirement of 35 U.S.C. § 101, e.g., whether there is evidence that an 85% correlation
between sequences/motifs of the protein coded for by the claimed nucleic acid and TIMM8b, a
protein known to have utility for mitochondrial transport, and to be associated with autosomal
recessive neurodegenerative disorders, demonstrates a "substantial likelihood" of utility under 35
U.S.C. § 101. Whether the differential expression of the polynucleotides in cancers of the breast,
kidney, and ovaries provides a substantial likelihood of utility in the use of the claimed
polynucleotides in the detection and diagnosis of these diseases.

2. Whether one of ordinary skill in the art would know how to use the claimed 
sequences, e.g., in toxicology testing, drug development, and the diagnosis of disease, so as to 
satisfy the enablement requirement of 35 U.S.C. § 112, first paragraph.

(7) GROUPING OF THE CLAIMS

As to Issue 1 

All of the claims on appeal are grouped together. 
As to Issue 2 

All of the claims on appeal are grouped together. 

(8) APPELLANTS' ARGUMENTS 

The rejection of claims 1-6 is improper, as the inventions of those claims have a 
patentable utility as set forth in the instant specification, and/or a utility well known to one 
of ordinary skill in the art. 

The invention at issue is a polynucleotide sequence corresponding to a gene that is
expressed in humans. The novel polynucleotide codes for a polypeptide demonstrated in the
patent specification to be a member of the class of DDP-related mitochondrial import proteins
(DDP/TIM), whose biological functions include the import of certain transmembrane carrier
proteins from the cytoplasm to the mitochondrial inner membrane. See Jin et al. (1999), Abstract,
p. 259. As such, the claimed invention has numerous practical, beneficial uses in toxicology
testing, drug development, and the diagnosis of disease, none of which requires knowledge of
how the polypeptide coded for by the polynucleotide actually functions.

Appellants submit with this brief the declaration of Bedilion describing some of the 
practical uses of the claimed invention in gene and protein expression monitoring applications. 
The Bedilion declaration demonstrates that the positions and arguments made by the Patent 
Examiner with respect to the utility of the claimed polynucleotide are without merit. 

The Bedilion declaration describes, in particular, how the claimed expressed
polynucleotide can be used in gene expression monitoring applications that were well-known at
the time the patent application was filed, and how those applications are useful in developing
drugs and monitoring their activity. Dr. Bedilion states that the claimed invention is a useful tool
when employed as a highly specific probe in a cDNA microarray:

Persons skilled in the art would appreciate that cDNA microarrays that contained the SEQ 
ID NO:1-encoding polynucleotides would be a more useful tool than cDNA microarrays
that did not contain the polynucleotides in connection with conducting gene expression 
monitoring studies on proposed (or actual) drugs for treating cancer and 
neurodegenerative disorders for such purposes as evaluating their efficacy and toxicity. 

The Patent Examiner does not dispute that the claimed polynucleotide can be used as a 
probe in cDNA microarrays and used in gene expression monitoring applications. Instead, the 
Patent Examiner contends that the claimed polynucleotide cannot be useful without precise 
knowledge of its biological function. But the law never has required knowledge of biological 
function to prove utility. It is the claimed invention's uses, not its functions, that are the subject 
of a proper analysis under the utility requirement. 

In any event, as demonstrated by the Bedilion declaration, the person of ordinary skill in 
the art can achieve beneficial results from the claimed polynucleotide in the absence of any 
knowledge as to the precise function of the protein encoded by it. The uses of the claimed
polynucleotide in gene expression monitoring applications are in fact independent of its precise
function. Indeed, the Final Office Action fails to acknowledge, let alone address, the Hillman
'117 disclosure that cDNA microarrays can be used "to monitor the expression level of large
numbers of genes simultaneously" for a number of purposes, including "to develop and monitor
the activities of therapeutic agents." (Hillman '117 application at page 13.)

Under the circumstances, Appellants are submitting with this Appeal Brief (in triplicate) 
a Declaration of Dr. Tod Bedilion under 37 C.F.R. § 1.132 (the Bedilion Declaration). As we
will show, the Bedilion Declaration and these further references show the many substantial
reasons why the Examiner's positions and arguments, in particular with respect to the use of the
SEQ ID NO: 1-encoding polynucleotides in a cDNA microarray, are without merit, and that the
ignored toxicology disclosure should have been given additional and more adequate 
consideration. 

I. The Applicable Legal Standard 

To meet the utility requirement of sections 101 and 112 of the Patent Act, the patent
applicant need only show that the claimed invention is "practically useful," Anderson v. Natta,
480 F.2d 1392, 1397, 178 USPQ 458 (CCPA 1973) and confers a "specific benefit" on the
public. Brenner v. Manson, 383 U.S. 519, 534-35, 148 USPQ 689 (1966). As discussed in a
recent Court of Appeals for the Federal Circuit case, this threshold is not high:

An invention is "useful" under section 101 if it is capable of providing some identifiable 
benefit. See Brenner v. Manson, 383 U.S. 519, 534 [148 USPQ 689] (1966); Brooktree 
Corp. v. Advanced Micro Devices, Inc., 977 F.2d 1555, 1571 [24 USPQ2d 1401] (Fed. 
Cir. 1992) ("to violate Section 101 the claimed device must be totally incapable of 
achieving a useful result"); Fuller v. Berger, 120 F. 274, 275 (7th Cir. 1903) (test for
utility is whether invention "is incapable of serving any beneficial end"). 

Juicy Whip Inc. v. Orange Bang Inc., 51 USPQ2d 1700 (Fed. Cir. 1999). 

While an asserted utility must be described with specificity, the patent applicant need not
demonstrate utility to a certainty. In Stiftung v. Renishaw PLC, 945 F.2d 1173, 1180,
20 USPQ2d 1094 (Fed. Cir. 1991), the United States Court of Appeals for the Federal Circuit
explained:

An invention need not be the best or only way to accomplish a certain result, and it need
only be useful to some extent and in certain applications: "[T]he fact that an invention has
only limited utility and is only operable in certain applications is not grounds for finding
lack of utility." Envirotech Corp. v. Al George, Inc., 730 F.2d 753, 762, 221 USPQ 473,
480 (Fed. Cir. 1984).

The specificity requirement is not, therefore, an onerous one. If the asserted utility is 
described so that a person of ordinary skill in the art would understand how to use the claimed 
invention, it is sufficiently specific. See Standard Oil Co. v. Montedison, S.p.A., 212 U.S.P.Q.
327, 343 (3d Cir. 1981). The specificity requirement is met unless the asserted utility amounts to
a "nebulous expression" such as "biological activity" or "biological properties" that does not
convey meaningful information about the utility of what is being claimed. Cross v. Iizuka,
753 F.2d 1040, 1048 (Fed. Cir. 1985). 

In addition to conferring a specific benefit on the public, the benefit must also be 
"substantial." Brenner, 383 U.S. at 534. A "substantial" utility is a practical, "real-world" 
utility. Nelson v. Bowler, 626 F.2d 853, 856, 206 USPQ 881 (CCPA 1980). 

If persons of ordinary skill in the art would understand that there is a "well-established" 
utility for the claimed invention, the threshold is met automatically and the applicant need not 
make any showing to demonstrate utility. Manual of Patent Examining Procedure at
§ 706.03(a). Only if there is no "well-established" utility for the claimed invention must the 
applicant demonstrate the practical benefits of the invention. Id. 

Once the patent applicant identifies a specific utility, the claimed invention is presumed 
to possess it. In re Cortright, 165 F.3d 1353, 1357, 49 USPQ2d 1464 (Fed. Cir. 1999); In re
Brana, 51 F.3d 1560, 1566, 34 USPQ2d 1436 (Fed. Cir. 1995). In that case, the Patent Office
bears the burden of demonstrating that a person of ordinary skill in the art would reasonably
doubt that the asserted utility could be achieved by the claimed invention. Id. To do so, the
Patent Office must provide evidence or sound scientific reasoning. See In re Langer, 503 F.2d
1380, 1391-92, 183 USPQ 288 (CCPA 1974). If and only if the Patent Office makes such a 
showing, the burden shifts to the applicant to provide rebuttal evidence that would convince the 
person of ordinary skill that there is sufficient proof of utility. Brana, 51 F.3d at 1566. The 
applicant need only prove a "substantial likelihood" of utility; certainty is not required. Brenner, 
383 U.S. at 532. 

II. Toxicology testing, drug discovery, and disease diagnosis are sufficient utilities
under 35 U.S.C. §§ 101 and 112, first paragraph

The claimed invention meets all of the necessary requirements for establishing a credible
utility under the Patent Law: There are "well-established" uses for the claimed invention known
to persons of ordinary skill in the art, and there are specific practical and beneficial uses for the
invention disclosed in the patent application's specification. These uses are explained, in detail,
in the Bedilion declaration accompanying this brief. Objective evidence, not considered by the
Patent Office, further corroborates the credibility of the asserted utilities.

A. The uses of TRP-encoding polynucleotides for toxicology testing, drug
discovery, and disease diagnosis are practical uses that confer "specific
benefits" to the public

The claimed invention has specific, substantial, real-world utility by virtue of its use in 
toxicology testing, drug development and disease diagnosis through gene expression profiling. 
These uses are explained in detail in the accompanying Bedilion declaration, the substance of 
which is not rebutted by the Patent Examiner. There is no dispute that the claimed invention is in 
fact a useful tool in cDNA microarrays used to perform gene expression analysis. That is
sufficient to establish utility for the claimed polynucleotide. 

In his Declaration, Dr. Bedilion explains the many reasons why a person skilled in the art 
reading the Hillman '117 application on February 8, 2001 would have understood that application
to disclose the claimed polynucleotide to be useful for a number of gene expression monitoring
applications, e.g., as a highly specific probe for the expression of that specific polynucleotide in
connection with the development of drugs and the monitoring of the activity of such drugs.
(Bedilion Declaration at, e.g., ¶¶ 10-15). Much, but not all, of Dr. Bedilion's explanation
concerns the use of the claimed polynucleotide in cDNA microarrays of the type first developed
at Stanford University for evaluating the efficacy and toxicity of drugs, as well as for other
applications. (Bedilion Declaration, ¶¶ 12 and 15).¹

¹ Dr. Bedilion also explained, for example, why persons skilled in the art would also
appreciate, based on the Hillman '117 specification, that the claimed polynucleotide would be
useful in connection with developing new drugs using technology, such as Northern analysis, that
predated by many years the development of the cDNA technology (Bedilion Declaration, ¶ 16).

In connection with his explanations, Dr. Bedilion states that the "Hillman '117
specification would have led a person skilled in the art on February 8, 2001 who was using gene
expression monitoring in connection with working on developing new drugs for the treatment of
cancers and neurodegenerative disorders [a] to conclude that a cDNA microarray that contained
the SEQ ID NO: 1-encoding polynucleotides would be a highly useful tool, and [b] to request
specifically that any cDNA microarray that was being used for such purposes contain the SEQ ID
NO: 1-encoding polynucleotides" (Bedilion Declaration, ¶ 15). For example, as explained by Dr.
Bedilion, "[p]ersons skilled in the art would [have] appreciated on February 8, 2001 that a cDNA
microarray that contained the SEQ ID NO: 1-encoding polynucleotides would be a more useful
tool than a cDNA microarray that did not contain the polynucleotides in connection with
conducting gene expression monitoring studies on proposed (or actual) drugs for treating cancers
and neurodegenerative disorders for such purposes as evaluating their efficacy and toxicity." Id.

In support of those statements, Dr. Bedilion provided detailed explanations of how cDNA 
technology can be used to conduct gene expression monitoring evaluations, with extensive 
citations to pre-February 8, 2001 publications showing the state of the art on February 8, 2001. 
(Bedilion Declaration, ¶¶ 10-14). While Dr. Bedilion's explanations in paragraph 15 of his
Declaration include almost four pages of text and seven subparts [(a)-(g)], he specifically states
that his explanations are not "all-inclusive." Id. For example, with respect to toxicity
evaluations, Dr. Bedilion had earlier explained how persons skilled in the art who were working
on drug development on February 8, 2001 (and for several years prior to February 8, 2001)
"without any doubt appreciated that the toxicity (or lack of toxicity) of any proposed drug they
were working on was one of the most important criteria to be evaluated in connection with the
development of the drug" and how the teachings of the Hillman '117 application clearly include
using differential gene expression analyses in toxicity studies (Bedilion Declaration, ¶ 10).

Thus, the Bedilion Declaration establishes that persons skilled in the art reading the 
Hillman '117 application at the time it was filed "would have wanted their cDNA microarray to
have a SEQ ID NO: 1-encoding polynucleotide probe because a microarray that contained such a
probe (as compared to one that did not) would provide more useful results in the kind of gene
expression monitoring studies using cDNA microarrays that persons skilled in the art have been
doing since well prior to February 8, 2001" (Bedilion Declaration, ¶ 15, item (g)). This, by itself,
provides more than sufficient reason to compel the conclusion that the Hillman '117 application
disclosed to persons skilled in the art at the time of its filing substantial, specific and credible 
real-world utilities for the claimed polynucleotide. 

Nowhere does the Patent Examiner address the fact that, as described on pp. 12-13 of the 
Hillman '117 application, the claimed polynucleotides can be used as highly specific probes in,
for example, cDNA microarrays - probes that without question can be used to measure both the 
existence and amount of complementary RNA sequences known to be the expression products of 
the claimed polynucleotides. The claimed invention is not, in that regard, some random sequence 
whose value as a probe is speculative or would require further research to determine. 

Given the fact that the claimed polynucleotide is known to be expressed, its utility as a 
measuring and analyzing instrument for expression levels is as indisputable as a scale's utility for 
measuring weight. This use as a measuring tool, regardless of how the expression level data
ultimately would be used by a person of ordinary skill in the art, by itself demonstrates that the
claimed invention provides an identifiable, real-world benefit that meets the utility requirement.
Raytheon v. Roper, 724 F.2d 951 (Fed. Cir. 1983) (claimed invention need only meet one of its
stated objectives to be useful); In re Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999) (how the
invention works is irrelevant to utility); MPEP § 2107 ("Many research tools such as gas
chromatographs, screening assays, and nucleotide sequencing techniques have a clear, specific,
and unquestionable utility (e.g., they are useful in analyzing compounds)" (emphasis added)).

Though appellants need not so prove to demonstrate utility, there can be no reasonable 
dispute that persons of ordinary skill in the art have numerous uses for information about relative 
gene expression including, for example, understanding the effects of a potential drug for treating 
cancer and neurodegenerative disorders. Because the patent application states explicitly that the 
claimed polynucleotide is known to be expressed both in normal cells as well as cancerous and 
immortalized cells (see the Hillman '117 application at p. 9), and expresses a protein that is a
member of a class of DDP-related mitochondrial import proteins known to be associated with
autosomal neurodegenerative disorders (see the Hillman '117 application at p. 1 and Jin et al.,
1999), there can be no reasonable dispute that a person of ordinary skill in the art could put the
claimed invention to such use. In other words, the person of ordinary skill in the art can derive
more information about a potential drug candidate for cancer or a neurodegenerative disorder, or
a potential toxin with the claimed invention than without it (see Bedilion Declaration at, e.g.,
¶ 15, subparts [(e)-(g)]).

The Bedilion Declaration shows that a number of pre-February 8, 2001 publications
confirm and further establish the utility of cDNA microarrays in a wide range of drug
development gene expression monitoring applications at the time the Hillman '117 application
was filed (Bedilion Declaration ¶¶ 10-14; Bedilion Exhibits A-G). Indeed, Brown and Shalon
U.S. Patent No. 5,807,522 (the Brown '522 patent, Bedilion Exhibit D), which issued from a
patent application filed in June 1995 and was effectively published on December 29, 1995 as a
result of the publication of a PCT counterpart application, shows that the Patent Office
recognizes the patentable utility of the cDNA technology developed in the early to mid-1990s.
As explained by Dr. Bedilion, among other things (Bedilion Declaration, ¶ 12):

The Brown '522 patent further teaches that the "[m]icroarrays of immobilized
nucleic acid sequences prepared in accordance with the invention" can be used in
"numerous" genetic applications, including "monitoring of gene expression"
applications (see Bedilion Tab D at col. 14, lines 36-42). The Brown '522 patent 
teaches (a) monitoring gene expression (i) in different tissue types, (ii) in different 
disease states, and (iii) in response to different drugs, and (b) that arrays disclosed 
therein may be used in toxicology studies (see Bedilion Tab D at col. 15, lines 13- 
18 and 52-58 and col. 18, lines 25-30). 

Literature reviews published shortly after the filing of the Hillman '117 application
describing the state of the art further confirm the claimed invention's utility. Rockett et al.
confirm, for example, that the claimed invention is useful for differential expression analysis
regardless of how expression is regulated:

Despite the development of multiple technological advances which have recently 
brought the field of gene expression profiling to the forefront of molecular 
analysis, recognition of the importance of differential gene expression and 
characterization of differentially expressed genes has existed for many years. 

* * *

Although differential expression technologies are applicable to a broad range of 
models, perhaps their most important advantage is that, in most cases, absolutely 
no prior knowledge of the specific genes which are up- or down-regulated is 
required. 

* * *



109255    09/781,117    Docket No.: PC-0034 US



Whereas it would be informative to know the identity and functionality of all 
genes up/down regulated by . . . toxicants, this would appear a longer term goal 
.... However, the current use of gene profiling yields a pattern of gene changes 
for a xenobiotic of unknown toxicity which may be matched to that of well 
characterized toxins, thus alerting the toxicologist to possible in vivo similarities 
between the unknown and the standard, thereby providing a platform for more
extensive toxicological examination, (emphasis added) 

Rockett et al., Differential gene expression in drug metabolism and toxicology: practicalities,
problems and potential, 29 Xenobiotica No. 7, 655 (1999).

In another pre-February 2001 article, Lashkari et al. state explicitly that sequences that are
merely "predicted" to be expressed (predicted Open Reading Frames, or ORFs) - the claimed
invention in fact is known to be expressed - have numerous uses:

Efforts have been directed toward the amplification of each predicted ORF or any 
other region of the genome ranging from a few base pairs to several kilobase 
pairs. There are many uses for these amplicons - they can be cloned into standard
vectors or specialized expression vectors, or can be cloned into other specialized
vectors such as those used for two-hybrid analysis. The amplicons can also be
used directly by, for example, arraying onto glass for expression analysis, for
DNA binding assays, or for any direct DNA assay. 

Lashkari et al., Whole genome analysis: Experimental access to all genome sequenced segments
through larger-scale efficient oligonucleotide synthesis and PCR, 94 Proc. Nat. Acad. Sci. 8945
(Aug. 1997) (emphasis added). 

B. The use of nucleic acids coding for proteins expressed by humans as tools for 
toxicology testing, drug discovery, and the diagnosis of disease is now "well- 
established" 

The technologies made possible by expression profiling and the DNA tools upon which 
they rely are now well-established. The technical literature recognizes not only the prevalence of 
these technologies, but also their unprecedented advantages in drug development, testing and 
safety assessment. These technologies include toxicology testing, as described by Bedilion in his 
declaration. 

Toxicology testing is now standard practice in the pharmaceutical industry. See, e.g., 
John C. Rockett et al., supra:

Knowledge of toxin-dependent regulation in target tissues is not solely an academic
pursuit as much interest has been generated in the pharmaceutical industry to harness this 
technology in the early identification of toxic drug candidates, thereby shortening the 
developmental process and contributing substantially to the safety assessment of new 
drugs. 

To the same effect are several other scientific publications, including Emile F. Nuwaysir et al.,
Microarrays and Toxicology: The Advent of Toxicogenomics, 24 Molecular Carcinogenesis 153
(1999); Sandra Steiner and N. Leigh Anderson, Expression profiling in toxicology -- potentials
and limitations, 112-13 Toxicology Letters 467 (2000).

Nucleic acids useful for measuring the expression of whole classes of genes are routinely
incorporated for use in toxicology testing. Nuwaysir et al. describe, for example, a Human
ToxChip comprising 2089 human clones, which were selected

for their well-documented involvement in basic cellular processes as well as their 
responses to different types of toxic insult. Included on this list are DNA replication and 
repair genes, apoptosis genes, and genes responsive to PAHs and dioxin-like compounds, 
peroxisome proliferators, estrogenic compounds, and oxidant stress. Some of the other
categories of genes include transcription factors, oncogenes, tumor suppressor genes, 
cyclins, kinases, phosphatases, cell adhesion and motility genes, and homeobox genes. 
Also included in this group are 84 housekeeping genes, whose hybridization intensity is 
averaged and used for signal normalization of the other genes on the chip. 

See also Table 1 of Nuwaysir et al. (listing additional classes of genes deemed to be of special
interest in making a human toxicology microarray).

The more genes that are available for use in toxicology testing, the more powerful the 
technique. "Arrays are at their most powerful when they contain the entire genome of the species 
they are being used to study." John C. Rockett and David J. Dix, Application of DNA Arrays to 
Toxicology, 107 Environ. Health Perspec. 681, No. 8 (1999). Control genes are carefully selected
for their stability across a large set of array experiments in order to best study the effect of 
toxicological compounds. See attached email from the primary investigator on the Nuwaysir 
paper, Dr. Cynthia Afshari, to an Incyte employee, dated July 3, 2000, as well as the original 
message to which she was responding, indicating that even the expression of carefully selected 
control genes can be altered. Thus, there is no expressed gene which is irrelevant to screening 
for toxicological effects, and all expressed genes have a utility for toxicological screening. 

In fact, the potential benefits to the public, in terms of lives saved and reduced health care
costs, are enormous. Recent developments provide evidence that the benefits of this information
are already beginning to manifest themselves. Examples include the following:

• In 1999, CV Therapeutics, an Incyte collaborator, was able to use Incyte gene 
expression technology, information about the structure of a known transporter 
gene, and chromosomal mapping location, to identify the key gene associated with 
Tangier disease. This discovery took place over a matter of only a few weeks,
due to the power of these new genomics technologies. The discovery received an 
award from the American Heart Association as one of the top 10 discoveries 
associated with heart disease research in 1999. 

• In an April 9, 2000, article published by the Bloomberg news service, an Incyte 
customer stated that it had reduced the time associated with target discovery and 
validation from 36 months to 18 months, through use of Incyte's genomic
information database. Other Incyte customers have privately reported similar 
experiences. The implications of this significant saving of time and expense for 
the number of drugs that may be developed and their cost are obvious. 

• In a February 10, 2000, article in the Wall Street Journal, one Incyte customer 
stated that over 50 percent of the drug targets in its current pipeline were derived 
from the Incyte database. Other Incyte customers have privately reported similar 
experiences. By doubling the number of targets available to pharmaceutical 
researchers, Incyte genomic information has demonstrably accelerated the 
development of new drugs. 

Because the Patent Examiner failed to address or consider the "well-established" utilities 
for the claimed invention in toxicology testing, drug development, and the diagnosis of disease, 
the Examiner's rejections should be overturned regardless of their merit. 

C. The similarity of the polypeptide encoded by the claimed invention to 
another polypeptide of undisputed utility demonstrates utility 

In addition to having substantial, specific and credible utilities in numerous gene 
expression monitoring applications, the utility of the claimed polynucleotide can be imputed 
based on the relationship between the polypeptide it encodes, TRP, and another polypeptide of 
unquestioned utility, TIMM8b. The two polypeptides have sufficient similarities in their 
sequences that a person of ordinary skill in the art would recognize more than a reasonable 
probability that the polypeptide encoded for by the claimed invention has utility similar to 
TIMM8b. Appellant need not show any more to demonstrate utility. In re Brana, 51 F.3d at
1567.




It is undisputed, and readily apparent from the patent application, that the polypeptide
encoded for by the claimed polynucleotide shares more than 85% sequence identity over 98
amino acid residues with TIMM8b. See specification, at p. 9. The two proteins are, in fact, 100%
identical over all but the first 15 N-terminal amino acids and share the CX3CX14CX3C motif
characteristic of members of the DDP/TIM family of mitochondrial proteins. This is more than
enough homology to demonstrate a reasonable probability that the utility of TIMM8b can be
imputed to the claimed invention (through the polypeptide it encodes). It is well-known that the
probability that two unrelated polypeptides share more than 40% sequence homology over 70
amino acid residues is exceedingly small. Brenner et al., Proc. Natl. Acad. Sci. 95:6073-78
(1998), see Response to Office Action, filed 10/10/2002. Given homology in excess of 40% over
many more than 70 amino acid residues, the probability that the polypeptide encoded for by the
claimed polynucleotide is related to TIMM8b is, accordingly, very high.

The Examiner must accept the applicants' demonstration that the homology between the 
polypeptide encoded for by the claimed invention and TIMM8b demonstrates utility by a 
reasonable probability unless the Examiner can demonstrate through evidence or sound scientific 
reasoning that a person of ordinary skill in the art would doubt utility. See In re Langer, 503
F.2d 1380, 1391-92, 183 USPQ 288 (CCPA 1974). The Examiner has not provided sufficient 
evidence or sound scientific reasoning to the contrary. 

While the Examiner has cited literature identifying some of the difficulties that may be 
involved in predicting protein function, none suggests that functional homology cannot be 
inferred by a reasonable probability in this case. See Skolnick et al. (2000) and Bork et al. (1998),
Office Action filed 7/11/2002. Most important, none contradicts Brenner's basic rule that
sequence homology in excess of 40% over 70 or more amino acid residues yields a high
probability of functional homology as well. Nor do they contradict the significance of applicants'
disclosure regarding the shared homology of the CX3CX14CX3C motif between the two proteins.
At most, these articles individually and together stand for the proposition that it is difficult to 
make predictions about function with certainty. The standard applicable in this case is not, 
however, proof to certainty, but rather proof to reasonable probability. 

D. Objective evidence corroborates the utilities of the claimed invention

There is, in fact, no restriction on the kinds of evidence a Patent Examiner may consider
in determining whether a "real-world" utility exists. Indeed, "real-world" evidence, such as
evidence showing actual use or commercial success of the invention, can demonstrate conclusive
proof of utility. Raytheon v. Roper, 220 USPQ 592 (Fed. Cir. 1983); Nestle v. Eugene, 55
F.2d 854, 856, 12 USPQ 335 (6th Cir. 1932). Indeed, proof that the invention is made, used or 
sold by any person or entity other than the patentee is conclusive proof of utility. United States 
Steel Corp. v. Phillips Petroleum Co., 865 F.2d 1247, 1252, 9 USPQ2d 1461 (Fed. Cir. 1989). 

Over the past several years, a vibrant market has developed for databases containing all 
expressed genes (along with the polypeptide translations of those genes), in particular genes 
having medical and pharmaceutical significance such as the instant sequence. (Note that the 
value in these databases is enhanced by their completeness, but each sequence in them is 
independently valuable.) The databases sold by Appellants' assignee, Incyte, include exactly the 
kinds of information made possible by the claimed invention, such as tissue and disease 
associations. Incyte sells its database containing the claimed sequence and millions of other 
sequences throughout the scientific community, including to pharmaceutical companies who use 
the information to develop new pharmaceuticals. 

Both Incyte's customers and the scientific community have acknowledged that Incyte's
databases have proven to be valuable in, for example, the identification and development of drug 
candidates. As Incyte adds information to its databases, including the information that can be 
generated only as a result of Incyte's discovery of the claimed polynucleotide and its use of that 
polynucleotide on cDNA microarrays, the databases become even more powerful tools. Thus the 
claimed invention adds more than incremental benefit to the drug discovery and development 
process. 

III. The Patent Examiner's Rejections Are Without Merit 

Rather than responding to the evidence demonstrating utility, the Examiner attempts to 
dismiss it altogether by arguing that the disclosed and well-established utilities for the claimed 
polynucleotide are not "specific, substantial, and credible" utilities. Office Action filed
7/11/2002, p. 3. The Examiner is incorrect both as a matter of law and as a matter of fact.




A. The Precise Biological Role Or Function Of An Expressed Polynucleotide Is 
Not Required To Demonstrate Utility 

The Patent Examiner's primary rejection of the claimed invention is based on the ground 
that, without information as to the precise "biological role" of the claimed invention, the claimed 
invention's utility is not sufficiently specific. See Office Action filed 7/11/2002, p. 4. According
to the Examiner, it is not enough that a person of ordinary skill in the art could use and, in fact,
would want to use the claimed invention either by itself or in a cDNA microarray to monitor the
expression of genes for such applications as the evaluation of a drug's efficacy and toxicity. The 
Examiner would require, in addition, that the applicant provide a specific and substantial 
interpretation of the results generated in any given expression analysis. 

It may be that specific and substantial interpretations and detailed information on 
biological function are necessary to satisfy the requirements for publication in some technical 
journals, but they are not necessary to satisfy the requirements for obtaining a United States 
patent. The relevant question is not, as the Examiner would have it, whether it is known how or 
why the invention works, In re Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999), but rather
whether the invention provides an "identifiable benefit" in presently available form. Juicy Whip
Inc. v. Orange Bang Inc., 185 F.3d 1364, 1366 (Fed. Cir. 1999). If the benefit exists, and there is
a substantial likelihood the invention provides the benefit, it is useful. There can be no doubt,
particularly in view of the Bedilion Declaration (at, e.g., ¶¶ 10 and 15), that the present
invention meets this test. 

The threshold for determining whether an invention produces an identifiable benefit is 
low. Juicy Whip, 185 F.3d at 1366. Only those utilities that are so nebulous that a person of 
ordinary skill in the art would not know how to achieve an identifiable benefit and, at least 
according to the PTO guidelines, so-called "throwaway" utilities that are not directed to a person
of ordinary skill in the art at all, do not meet the statutory requirement of utility. Utility 
Examination Guidelines, 66 Fed. Reg. 1092 (Jan. 5, 2001). 

Knowledge of the biological function or role of a biological molecule has never been 
required to show real-world benefit. In its most recent explanation of its own utility guidelines, 
the PTO acknowledged so much (66 F.R. at 1095): 

[T]he utility of a claimed DNA does not necessarily depend on the function of the
encoded gene product. A claimed DNA may have specific and substantial utility
because, e.g., it hybridizes near a disease-associated gene or it has gene-regulating
activity. 

By implicitly requiring knowledge of biological function for any claimed nucleic acid, the 
Examiner has, contrary to law, elevated what is at most an evidentiary factor into an absolute 
requirement of utility. Rather than looking to the biological role or function of the claimed 
invention, the Examiner should have looked first to the benefits it is alleged to provide. 

B. Membership in a Class of Useful Products Can Be Proof of Utility 

Despite the uncontradicted evidence that the claimed polynucleotide encodes a 
polypeptide in the DDP/TIM family of mitochondrial import proteins, the Examiner refused to
impute the utility of the members of the DDP/TIM family to TRP. In the Office Action, the
Patent Examiner takes the position that, unless appellants can identify which particular biological 
function within the class of DDP/TIM is possessed by TRP, utility cannot be imputed. To 
demonstrate utility by membership in the class of DDP/TIM, the Examiner would require that all 
DDP/TIM proteins possess a "common" utility. 

There is no such requirement in the law. In order to demonstrate utility by membership in 
a class, the law requires only that the class not contain a substantial number of useless members. 
So long as the class does not contain a substantial number of useless members, there is sufficient 
likelihood that the claimed invention will have utility, and a rejection under 35 U.S.C. § 101 is 
improper. That is true regardless of how the claimed invention ultimately is used and whether or 
not the members of the class possess one utility or many. See Brenner v. Manson, 383 U.S. 519,
532 (1966); Application of Kirk, 376 F.2d 936, 943 (CCPA 1967). 

Membership in a "general" class is insufficient to demonstrate utility only if the class 
contains a sufficient number of useless members such that a person of ordinary skill in the art 
could not impute utility by a substantial likelihood. There would be, in that case, a substantial 
likelihood that the claimed invention is one of the useless members of the class. In the few cases 
in which class membership did not prove utility by substantial likelihood, the classes did in fact 
include predominately useless members. E.g., Brenner (man-made steroids); Kirk (same); Natta
(man-made polyethylene polymers). 

The Examiner addresses TRP as if the general class in which it is included is not the
DDP/TIM family, but rather all polynucleotides or all polypeptides, including the vast majority
of useless theoretical molecules not occurring in nature, and thus not pre-selected by nature to be
useful. While these "general classes" may contain a substantial number of useless members, the
DDP/TIM family does not. The DDP/TIM family is sufficiently specific to rule out any
reasonable possibility that TRP would not also be useful like the other members of the family. 

Because the Examiner has not presented any evidence that the DDP/TIM class of 
mitochondrial import proteins has any, let alone a substantial number, of useless members, the 
Examiner must conclude that there is a "substantial likelihood" that the TRP encoded by the 
claimed polynucleotide is useful. It follows that the claimed polynucleotide also is useful. 

C. Because the uses of TRP in toxicology testing, drug discovery, and disease
diagnosis are practical uses beyond mere study of the invention itself, the 
claimed invention has substantial utility. 

The PTO rejected the claims at issue on the ground that the use of an invention as an
"object of further research" is not a substantial use. See Office Action filed 7/11/2002, p. 6.

As used in toxicology testing, drug discovery, and disease diagnosis, the claimed 
invention has a beneficial use in research other than studying the claimed invention or its protein 
products. It is a tool, rather than an object, of research. The data generated in gene expression 
monitoring using the claimed invention as a tool is not used merely to study the claimed 
polynucleotide itself, but rather to study properties of tissues, cells, and potential drug candidates 
and toxins. Without the claimed invention, the information regarding the properties of tissues, 
cells, drug candidates and toxins is less complete. [Bedilion Declaration at ¶ 15.]

The claimed invention has numerous additional uses as a research tool, each of which 
alone is a "substantial utility." These include, e.g., for chromosomal markers, probes, and in 
forensics. 

D. The Patent Examiner Failed to Demonstrate That a Person of Ordinary Skill 
in the Art Would Reasonably Doubt the Utility of the Claimed Invention 

In addition to appellants' arguments presented above regarding the Patent Examiner's
failure to establish a reasonable doubt as to the utility of the claimed invention based on sequence
homology to DDP/TIM proteins, the Examiner has also failed to present sufficient evidence or
sound scientific reasoning why one skilled in the art would doubt applicants' asserted utility of the
claimed invention in the detection and diagnosis of certain cancers based on differential
expression of the polynucleotide expressing TRP in these disorders.

Applicants have disclosed in the specification at p. 9 and in table 2 that SEQ ID NO:2, 
encoding TRP, was "overexpressed", i.e., was found in an abundance of at least one 
transcript/library, in cancerous cDNA libraries of the breast, ovaries, and kidney, but was not 
found in the correspondingly normal tissue libraries. In particular, the specification discloses that 
the breast and kidney tumor libraries in which the transcript was found (BRSTTUT14, 
KIDNTUT14, and KIDNTUT15) were matched with normal tissue libraries from the same
donor (BRSTNOT14, KIDNNOT20, and KIDNNOT19, respectively) in which expression was
undetectable. While the ovarian tumor samples were not matched with normal tissue libraries 
from the same donor, it is clear from Table 1 that over 100 cDNA libraries were examined in the 
category of "Female Reproductive" (which includes ovary, cervical, and uterine tissues) and that 
no expression of TRP in normal ovarian tissue libraries was found.

The Examiner does not dispute these facts, but merely alleges that the data of table 2 "is 
not definite". See Final Office Action, filed 12/17/2002, p. 3. As discussed above in section 1, 
there is no requirement in the law for "certainty" or "definiteness" in asserting a utility; an applicant
need only prove a "substantial likelihood" of utility. The Examiner merely alleges that a "single
sample" or "less than three" would require "more statistical data in order to make a sound 
scientific conclusion". The Examiner offers no evidence of what constitutes sufficient evidence 
to support a substantial likelihood of utility in this case. The Examiner's contention that TRP is 
just one among many genes overexpressed "during progressive proliferation of cancerous cells" 
and cannot be quantitatively analyzed is contradicted by applicants' disclosure that the gene
expression is in fact undetectable in the normal tissues relative to the cancerous tissue, and is 
thus quantitatively distinguishable. 

IV. By Requiring the Patent Applicant to Assert a Particular or Unique Utility, the 
Patent Examination Utility Guidelines and Training Materials Applied by the 
Patent Examiner Misstate the Law 

There is an additional, independent reason to overturn the rejections: to the extent the
rejections are based on Revised Interim Utility Examination Guidelines (64 FR 71427,
December 21, 1999), the final Utility Examination Guidelines (66 FR 1092, January 5, 2001)
and/or the Revised Interim Utility Guidelines Training Materials (USPTO Website
www.uspto.gov, March 1, 2000), the Guidelines and Training Materials are themselves
inconsistent with the law.

The Training Materials, which direct the Examiners regarding how to apply the Utility
Guidelines, address the issue of specificity with reference to two kinds of asserted utilities:
"specific" utilities which meet the statutory requirements, and "general" utilities which do not.
The Training Materials define a "specific utility" as follows:

A [specific utility] is specific to the subject matter claimed. This contrasts to general 
utility that would be applicable to the broad class of invention. For example, a claim to a 
polynucleotide whose use is disclosed simply as "gene probe" or "chromosome marker" 
would not be considered to be specific in the absence of a disclosure of a specific DNA
target. Similarly, a general statement of diagnostic utility, such as diagnosing an 
unspecified disease, would ordinarily be insufficient absent a disclosure of what condition 
can be diagnosed. 

The Training Materials distinguish between "specific" and "general" utilities by assessing 
whether the asserted utility is sufficiently "particular," i.e., unique (Training Materials at p.52) as
compared to the "broad class of invention." (In this regard, the Training Materials appear to
parallel the view set forth in Stephen G. Kunin, Written Description Guidelines and Utility
Guidelines, 82 J.P.T.O.S. 77, 97 (Feb. 2000) ("With regard to the issue of specific utility the
question to ask is whether or not a utility set forth in the specification is particular to the claimed 
invention.")). 

Such "unique" or "particular" utilities never have been required by the law. To meet the
utility requirement, the invention need only be "practically useful," Natta, 480 F.2d at 1397,
and confer a "specific benefit" on the public. Brenner, 383 U.S. at 534. Thus, incredible
"throwaway" utilities, such as trying to "patent a transgenic mouse by saying it makes great snake food,"
do not meet this standard. Karen Hall, Genomic Warfare, The American Lawyer 68 (June 2000)
(quoting John Doll, Chief of the Biotech Section of USPTO). 

This does not preclude, however, a general utility, contrary to the statement in the 
Training Materials where "specific utility" is defined (page 5). Practical real-world uses are not 
limited to uses that are unique to an invention. The law requires that the practical utility be
"definite," not particular. Montedison, 664 F.2d at 375. Appellant is not aware of any court that
has rejected an assertion of utility on the grounds that it is not "particular" or "unique" to the
specific invention. Where courts have found utility to be too "general," it has been in those cases
in which the asserted utility in the patent disclosure was not a practical use that conferred a 
specific benefit. That is, a person of ordinary skill in the art would have been left to guess as to 
how to benefit at all from the invention. In Kirk, for example, the CCPA held the assertion that a 
man-made steroid had "useful biological activity" was insufficient where there was no
information in the specification as to how that biological activity could be practically used. Kirk, 376
F.2d at 941. 

The fact that an invention can have a particular use does not provide a basis for requiring 
a particular use. See Brana, supra (disclosure describing a claimed antitumor compound as 
being homologous to an antitumor compound having activity against a "particular" type of cancer 
was determined to satisfy the specificity requirement). "Particularity" is not and never has been
the sine qua non of utility; it is, at most, one of many factors to be considered.

As described supra, broad classes of inventions can satisfy the utility requirement so long 
as a person of ordinary skill in the art would understand how to achieve a practical benefit from 
knowledge of the class. Only classes that encompass a significant portion of nonuseful members 
would fail to meet the utility requirement. Supra § II. B. 2 (Montedison, 664 F.2d at 374-75). 

The Training Materials fail to distinguish between broad classes that convey information 
of practical utility and those that do not, lumping all of them into the latter, unpatentable category
of "general" utilities. As a result, the Training Materials paint with too broad a brush.
Rigorously applied, they would render unpatentable whole categories of inventions that heretofore have
been considered to be patentable and that have indisputably benefitted the public, including the 
claimed invention. See supra § II. B. Thus the Training Materials cannot be applied consistently 
with the law. 

V. To the Extent the Rejection of the Patented Invention under 35 U.S.C. § 112, First 
Paragraph, Is Based on the Improper Rejection for Lack of Utility under 35 U.S.C. 
§ 101, it Must Be Reversed. 

The rejection set forth in the Office Action is based on the assertions discussed above, 
i.e., that the claimed invention lacks patentable utility. To the extent that the rejection under 
§ 112, first paragraph, is based on the improper allegation of lack of patentable utility under
§ 101, it fails for the same reasons. 

(10) CONCLUSION

Appellants respectfully submit that rejections for lack of utility based, inter alia, on an 
allegation of "lack of specificity," as set forth in the Office Action and as justified in the Revised 
Interim and final Utility Guidelines and Training Materials, are not supported in the law. Neither 
are they scientifically correct, nor supported by any evidence or sound scientific reasoning. 
These rejections are alleged to be founded on facts in court cases such as Brenner and Kirk, yet 
those facts are clearly distinguishable from the facts of the instant application, and indeed most if 
not all nucleotide and protein sequence applications. Nevertheless, the PTO is attempting to 
mold the facts and holdings of these prior cases, "like a nose of wax," to target rejections of
claims to polypeptide and polynucleotide sequences where biological activity information has not 
been proven by laboratory experimentation, and they have done so by ignoring perfectly 
acceptable utilities fully disclosed in the specification as well as well-established utilities known 
to those of skill in the art. As is disclosed in the specification, and even more clearly, as one of 
ordinary skill in the art would understand, the claimed invention has well-established, specific, 
substantial and credible utilities. The rejections are, therefore, improper and should be reversed. 

Moreover, to the extent the above rejections were based on the Revised Interim and final 
Examination Guidelines and Training Materials, those portions of the Guidelines and Training 
Materials that form the basis for the rejections should be determined to be inconsistent with the 
law. 

Due to the urgency of this matter, including its economic and public health implications, 
an expedited review of this appeal is earnestly solicited. 

If the USPTO determines that any additional fees are due, the Commissioner is hereby

APPENDIX - CLAIMS ON APPEAL

1. An isolated cDNA comprising a nucleic acid sequence encoding a protein having 
the amino acid sequence of SEQ ID NO:1, or the complement thereof.

2. An isolated cDNA comprising a nucleic acid sequence selected from:

a) SEQ ID NO:2 or the complement thereof;

b) a fragment of SEQ ID NO:2 selected from SEQ ID NOs:3-4 or the complement
thereof; and

c) a variant of SEQ ID NO:2 selected from SEQ ID NOs:6-11 or the complement
thereof.

3. A composition comprising the cDNA or the complement of the cDNA of claim 1 
and a labeling moiety. 

4. A vector comprising a cDNA encoding an amino acid sequence of SEQ ID NO: 1. 

5. A host cell comprising the vector of claim 4. 

6. A method for using a cDNA to produce a protein, the method comprising: 

a) culturing the host cell of claim 5 under conditions for protein expression; and 

b) recovering the protein from the host cell culture. 

1. An important feature of the work of many molecular biologists is identifying which
genes are switched on and off in a cell under different environmental conditions or
subsequent to xenobiotic challenge. Such information has many uses, including the
deciphering of molecular pathways and facilitating the development of new experimental
and diagnostic procedures. However, the student of gene hunting should be forgiven for
perhaps becoming confused by the mountain of information available, as there appears to be
almost as many methods of discovering differentially expressed genes as there are research
groups using the technique.

2. The aim of this review was to clarify the main methods of differential gene expression
analysis and the mechanistic principles underlying them. Also included is a discussion on
some of the practical aspects of using this technique. Emphasis is placed on the so-called
'open' systems, which require no prior knowledge of the genes contained within the study
model. Whilst these will eventually be replaced by closed systems in the study of human,
mouse and other commonly studied laboratory animals, they will remain a powerful tool for
those examining less fashionable models.

3. The use of suppression-PCR subtractive hybridization is exemplified in the
identification of up- and down-regulated genes in rat liver following exposure to
phenobarbital, a well-known inducer of the drug metabolizing enzymes.

4. Differential gene display provides a coherent platform for building libraries and
microchip arrays of gene fingerprints characteristic of known enzyme inducers and
xenobiotic toxicants, which may be interrogated subsequently for the identification and
characterization of xenobiotics of unknown biological properties.

Introduction

It is now apparent that the development of almost all cancers and many non-neoplastic
diseases are accompanied by altered gene expression in the affected cells
compared to their normal state (Hunter 1991, Wynford-Thomas 1991, Vogelstein
and Kinzler 1993, Semenza 1994, Cassidy 1995, Kleinjan and Van Heyningen 1998).
Such changes also occur in response to external stimuli such as pathogenic
micro-organisms (Rohn et al. 1996, Singh et al. 1997, Griffin and Krishna 1998, Lunney
1998) and xenobiotics (Sewall et al. 1995, Dogra et al. 1998, Ramana and Kohli
1998), as well as during the development of undifferentiated cells (Hecht 1998,
Rudin and Thompson 1998, Schneider-Maunoury et al. 1998). The potential
medical and therapeutic benefits of understanding the molecular changes which
occur in any given cell in progressing from the normal to the 'altered' state are
enormous. Such profiling essentially provides a 'fingerprint' of each step of a
cell's development or response and should help in the elucidation of specific and
sensitive biomarkers representing, for example, different types of cancer or previous
exposure to certain classes of chemicals that are enzyme inducers.

In dmg metabolism, many of the xenobiotiometaboiizing er.rvrr.es . inciudine 
the weh-cnaractenzed isoforms of cytochrome P450) are incucsbie ov d- u « and 
cnem.cals ,n man (Pelkonen et al. 1998), predommantlv mvoiv.ne :ransc-"-.onal 
activation of not only the cognate cytochrome P450 genes, bu: addit.onai cVliuia- 
proteins un.cn may be crucial to the phenomenon of induction. Accordirg'v -he 
aevdopment ot methodology to identify and assess the full compicme- o"' «™ 
that are either up- or down-regulated by inducers, are crucial in the development of
knowledge to understand the precise molecular mechanisms of enzyme induction
and how this relates to drug action. Similarly, in the field of chemically-induced
toxicity, it is now becoming increasingly obvious that most adverse reactions to
drugs and chemicals are the result of multiple gene regulation, some of which are
causal and some of which are casually-related to the toxicological phenomenon per
se. This observation has led to an upsurge in interest in gene-profiling technologies
which differentiate between the control and toxin-treated gene pools in target tissues
and is, therefore, of value in rationalizing the molecular mechanisms of xenobiotic-
induced toxicity. Knowledge of toxin-dependent gene regulation in target tissues is
not solely an academic pursuit as much interest has been generated in the
pharmaceutical industry to harness this technology in the early identification of toxic
drug candidates, thereby shortening the developmental process and contributing
substantially to the safety assessment of new drugs. For example, if the gene profile
in response to, say, a testicular toxin that has been well-characterized in vivo could be
determined in the testis, then this profile would be representative of all new drug
candidates which act via this specific molecular mechanism of toxicity, thereby
providing a useful and coherent approach to the early detection of such toxicants.
Whereas it would be informative to know the identity and functionality of all genes
up/down regulated by such toxicants, this would appear a longer term goal, as the
majority of human genes have not yet been sequenced, far less their functionality
determined. However, the current use of gene profiling yields a pattern of gene
changes for a xenobiotic of unknown toxicity which may be matched to that of well-
characterized toxins, thus alerting the toxicologist to possible in vivo similarities
between the unknown and the standard, thereby providing a platform for more
extensive toxicological examination. Such approaches are beginning to gain
momentum, in that several biotechnology companies are commercially producing
'gene chips' or 'gene arrays' that may be interrogated for toxicity assessment of
xenobiotics. These chips consist of hundreds/thousands of genes, some of which are
degenerate in the sense that not all of the genes are mechanistically-related to any
one toxicological phenomenon. Whereas these chips are useful in broad-spectrum
screening, they are maturing at a substantial rate, in that gene arrays are now
becoming more specific, e.g. chips for the identification of changes in growth factor
families that contribute to the aetiology and development of chemically-induced
neoplasias.
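
The idea of matching the gene-change pattern of an unknown xenobiotic to those of well-characterized toxins can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the gene names, the toxin names, the fold-change values and the choice of cosine similarity as the matching score are hypothetical and are not taken from the review.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two expression profiles (dict: gene -> change)."""
    genes = set(a) | set(b)
    dot = sum(a.get(g, 0.0) * b.get(g, 0.0) for g in genes)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reference_toxins(unknown, references):
    """Rank reference toxin profiles by similarity to the unknown profile."""
    scores = {name: cosine_similarity(unknown, profile)
              for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log2 fold changes (positive = up-regulated, negative = down-regulated).
references = {
    "toxin_A": {"gene1": 2.0, "gene2": -1.5, "gene3": 0.5},
    "toxin_B": {"gene1": -1.0, "gene2": 2.0, "gene4": 1.0},
}
unknown = {"gene1": 1.8, "gene2": -1.2, "gene3": 0.4}

ranking = rank_reference_toxins(unknown, references)
for name, score in ranking:
    print(f"{name}: similarity {score:.2f}")
```

A high score would only flag a possible mechanistic similarity for follow-up; it does not replace toxicological examination.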

Although documenting and explaining these genetic changes presents a
formidable obstacle to understanding the different mechanisms of development and
disease progression, the technology is now available to begin attempting this difficult
challenge. Indeed, several differential expression analysis methods have been
developed which facilitate the identification of gene products that demonstrate
altered expression in cells of one population compared to another. These methods
have been used to identify differential gene expression in many situations, including
invading pathogenic microbes (Zhao et al. 1998), in cells responding to extracellular
and intracellular microbial invasion (Duguid and Dinauer 1990, Ragno et al.,
Maldarelli et al. 1998), in chemically treated cells (Syed et al., Rockett et al.
1999), neoplastic cells (Liang et al. 1992, Chang and Terzaghi-Howe 1998),
activated cells (Gurskaya et al. 1996, Wan et al. 1996), differentiated cells (Hara et
al. 1991, Guimaraes et al. 1995a, b), and different cell types (Davis et al. 1984,
Hedrick et al. 1984, Xhu et al. 1998). Although differential expression analysis
technologies are applicable to a broad range of models, perhaps their most important
advantage is that, in most cases, absolutely no prior knowledge of the specific genes
which are up- or down-regulated is required.

The field of differential expression analysis is a large and complex one, with
many techniques available to the potential user. These can be categorized into 
several methodological approaches, including: 

(1) Differential screening, 

(2) Subtractive hybridization (SH) (includes methods such as chemical cross-
linking subtraction, CCLS; suppression-PCR subtractive hybridization,
SSH; and representational difference analysis, RDA),

(3) Differential display (DD), 

(4) Restriction endonuclease facilitated analysis (including serial analysis of gene 
expression — SAGE — and gene expression fingerprinting— GEF), 

(5) Gene expression arrays, and 

(6) Expressed sequence tag (EST) analysis. 

The above approaches have been used successfully to isolate differentially
expressed genes in different model systems. However, each method has its own
subtle (and sometimes not so subtle) characteristics which incur various advantages
and disadvantages. Accordingly, it is the purpose of this review to clarify the
mechanistic principles underlying the main differential expression methods and to
highlight some of the broader considerations and implications of this very powerful
and increasingly popular technique. Specifically, we will concentrate on the so-
called 'open' systems, namely those which do not require any knowledge of gene
sequences and, therefore, are useful for isolating unknown genes. Two 'closed'
systems (those utilising previously identified gene sequences), EST analysis and the
use of DNA arrays, will also be considered briefly for completeness. Whilst
emphasis will often be placed on suppression PCR subtractive hybridization (SSH,
the approach employed in this laboratory), it is the aim of the authors to highlight,
wherever possible, those areas of common interest to those who use, or intend to use,
differential gene expression analysis.



Differential cDNA library screening (DS)

Despite the development of multiple technological advances which have recently 
brought the field of gene expression profiling to the forefront of molecular analysis, 
recognition of the importance of differential gene expression and characterization of 
differentially expressed genes has existed for many years. One of the original 
approaches used to identify such genes was described 20 years ago by St John and 
Davis (1979). These authors developed a method, termed 'differential plaque filter
hybridization', which was used to isolate galactose-inducible DNA sequences from
yeast. The theory is simple: a genomic DNA library is prepared from normal,
unstimulated cells of the test organism/tissue and multiple filter replicas are
prepared. These replica blots are probed with radioactively (or otherwise) labelled
complex cDNA probes prepared from the control and test cell mRNA populations.
Those mRNAs which are differentially expressed in the treated cell population will
show a positive signal only on the filter probed with cDNA from the treated cells.
Furthermore, labelled cDNA from different test conditions can be used to probe
multiple blots, thereby enabling the identification of mRNAs which are only up-
regulated under certain conditions. For example, St John and Davis (1979) screened
replica filters with acetate-, glucose- and galactose-derived probes in order to obtain
genes induced specifically by galactose metabolism. Although groundbreaking in its
time, this method is now considered insensitive and time-consuming, as up to [illegible]
months are required to complete the identification of genes which are differentially
expressed in the test population. In addition, there is no convenient way to check
that the procedure has worked until the whole process has been completed.



Subtractive Hybridization (SH) 

The developing concept of differential gene expression and the success of early
approaches such as that described by St John and Davis (1979) soon gave rise to a
search for more convenient methods of analysis. One of the first to be developed was
subtractive hybridization (SH), numerous variations of which have since been
reported (see below). In general,
this approach involves hybridization of mRNA/cDNA from one population (tester)
to excess mRNA/cDNA from another (driver), followed by separation of the
unhybridized tester fraction (differentially expressed) from the hybridized common
sequences. This step has been achieved physically, chemically and through the use
of selective polymerase chain reaction (PCR) techniques.



Physical separation 

Original subtractive hybridization technology involved the physical separation
of hybridized common species from unique single stranded species. Several methods
of achieving this have been described, including hydroxyapatite chromatography
(Sargent and Dawid 1983), avidin-biotin technology (Duguid and Dinauer 1990)
and oligodT-latex separation (Hara et al. 1991). In the first approach, common
mRNA species are removed by cDNA (from test cells)-mRNA (from control cells)
subtractive hybridization followed by hydroxyapatite chromatography, as
hydroxyapatite specifically adsorbs the cDNA-mRNA hybrids. The unabsorbed cDNA is
then used either for the construction of a cDNA library of differentially expressed
genes (Sargent and Dawid 1983, Schneider et al. 1988) or directly as a probe to
screen a preselected library (Zimmerman et al. 1980, Davis et al. 1984, Hedrick et al.
1984). A schematic diagram of the procedure is shown in figure 1.

Less rigorous physical separation procedures coupled with sensitivity enhancing
PCR steps were later developed as a means to overcome some of the problems
encountered with the hydroxyapatite procedure. For example, Duguid and Dinauer
(1990) described a method of subtraction utilizing biotin-affinity systems as a means
to remove hybridized common sequences. In this process, both the control and
tester mRNA populations are first converted to cDNA and an adaptor ('oligovector',

Figure 1. The hydroxyapatite method of subtractive hybridization. cDNA derived from the
treated/altered (tester) population is mixed with a large excess of mRNA from the control (driver)
population. Following hybridization, mRNA-cDNA hybrids are removed by hydroxyapatite
chromatography. The only cDNAs which remain are those which are differentially expressed in
the treated/altered population. In order to facilitate the recovery of full length clones, small cDNA
fragments are removed by exclusion chromatography. The remaining cDNAs are then cloned into
a vector for sequencing, or labelled and used directly to probe a library, as described by Sargent
and Dawid (1983).

containing a restriction site) ligated to both sides. Both populations are then
amplified by PCR, but the driver cDNA population is subsequently digested with
the adaptor-containing restriction endonuclease. This serves to cleave the
oligovector and reduce the amplification potential of the control population. The digested
control population is then biotinylated and an excess mixed with tester cDNA.
Following denaturation and hybridization, the mix is applied to a biocytin column
(streptavidin may also be used) to remove the control population, including
heteroduplexes formed by annealing of common sequences from the tester
population. The procedure is repeated several times following the addition of fresh
control cDNA. In order to further enrich those species differentially expressed in
the tester cDNA, the subtracted tester population is amplified by PCR following
every second subtraction cycle. After six cycles of subtraction (three reamplification
steps) the reaction mix is ligated into a vector for further analysis.

In a slightly different approach, Hara et al. (1991) utilized a method whereby
oligo(dT30) primers attached to a latex substrate are used to first capture mRNA
extracted from the control population. Following 1st strand cDNA synthesis, the
RNA strand of the heteroduplexes is removed by heat denaturation and
centrifugation (the cDNA-oligotex-dT30 forms a pellet and the supernatant is removed).
A quantity of tester mRNA is then repeatedly hybridized to the immobilized control
(driver) cDNA (which is present in 20-fold excess). After several rounds of
hybridization the only mRNA molecules left in the tester mRNA population are
those which are not found in the driver cDNA-oligotex-dT30 population. These
tester-specific mRNA species are then converted to cDNA and, following the
addition of adaptor sequences, amplified by PCR. The PCR products are then
ligated into a vector for further analysis using restriction sites incorporated into the
PCR primers. A schematic illustration of this subtraction process is shown in figure
2.

However, all these methods utilising physical separation have been described as
inefficient due to the requirement for large starting amounts of mRNA, significant
loss of material during the separation process and a need for several rounds of
hybridization. Hence, new methods of differential expression analysis have recently
been designed to eliminate these problems.

Chemical Cross-Linking Subtraction (CCLS)

In this technique, originally described by Hampson et al. (1992), driver mRNA
is mixed with tester cDNA (1st strand only) in a ratio of > 20:1. The common
sequences form cDNA:mRNA hybrids, leaving the tester specific species as single
stranded cDNA. Instead of physically separating these hybrids, they are inactivated
chemically using 2,5-diaziridinyl-1,4-benzoquinone (DZQ). Labelled probes are
then synthesized from the remaining single stranded cDNA species (unreacted
mRNA species remaining from the driver are not converted into probe material due
to specificity of Sequenase T7 DNA polymerase used to make the probe) and used
to screen a cDNA library made from the tester cell population. A schematic diagram
of the system is shown in figure 3.

It has been shown that the differentially expressed sequences can be enriched at 
least 300-fold with one round of subtraction (Hampson et al. 1992), and that the 
technique should allow isolation of cDNAs derived from transcripts that are present 
at less than 50 copies per cell. This equates to genes at the low end of intermediate 
abundance (see table 1). The main advantages of the CCLS approach are that it is 
rapid, technically simple and also produces fewer false positives than other 
differential expression analysis methods. However, like the physical separation 
protocols, a major drawback with CCLS is the large amount of starting material 
required (at least 10 µg RNA). Consequently, the technique has recently been
refined so that a renewable source of RNA can be generated. The degenerate random 
oligonucleotide primed (DROP) adaptation (Hampson et al. 1996, Hampson and 
Hampson 1997) uses random hexanucleotide sequences to prime solid phase- 
synthesized cDNA. Since each primer includes a T7 polymerase promotor sequence
at the 5' end, the final pool of random cDNA fragments is a PCR-renewable cDNA
population which is representative of the expressed gene pool and can be used to
synthesize sense RNA for use as driver material. Furthermore, if the final pool of
random cDNA fragments is reamplified using biotinylated T7 primer and random
hexamer, the product can be captured with streptavidin beads and the antisense
strand eluted for use as tester. Since both target and driver can be generated from
the same DROP product, subtraction can be performed in both directions (i.e. for
up- and down-regulated species) between two different DROP products.

Representational Difference Analysis (RDA)

RDA of cDNA (Hubank and Schatz 1994) is an extension of the technique
originally applied to genomic DNA as a means of identifying differences between
two complex genomes (Lisitsyn et al. 1993). It is a process of subtraction and
amplification involving subtractive hybridization of the tester in the presence of
excess driver. Sequences in the tester that have homologues in the driver are
rendered unamplifiable, whereas those genes expressed only in the tester retain the
ability to be amplified by PCR. The procedure is shown schematically in figure 4.

In essence, the driver and tester mRNA populations are first converted to cDNA
and amplified by PCR following the ligation of an adaptor. The adaptors are then
removed from both populations and a new (different) adaptor ligated to the
amplified tester population only. Driver and tester populations are next melted and
hybridized together in a ratio of 100:1. Following hybridization, only tester:tester
homohybrids have 5' adaptors at each end of the DNA duplex and can, thus, be filled
in at both 3' ends. Hence, only these molecules are amplified exponentially during
the subsequent PCR step. Although tester:driver heterohybrids are present, they
only amplify in a linear fashion, since the strand derived from the driver has no
adaptor to which the primer can bind. Driver:driver homohybrids have no
adaptors and, therefore, are not amplified. Single stranded molecules are digested
with mung bean nuclease before a further PCR-enrichment of the tester:tester
homohybrids. The adaptors on the amplified tester population are then replaced and
the whole process repeated a further two or three times using an increasing excess of
driver (Hubank and Schatz used a tester:driver ratio of 1:400, 1:80000 and
1:800000 for the second, third and fourth hybridizations, respectively). Different
adaptors are ligated to the tester between successive rounds of hybridization and
amplification to prevent the accumulation of PCR products that might interfere with
subsequent amplifications. The final display is a series of differentially expressed
gene products easily observable on an ethidium bromide gel.
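
The selective amplification described above (exponential for tester:tester hybrids, linear for tester:driver hybrids, none for driver:driver hybrids) can be sketched numerically. This is an idealized illustration only: it assumes perfect PCR efficiency and exactly one new strand per cycle for the linear case, and the function name and starting copy numbers are hypothetical.

```python
def copies_after_pcr(hybrid_type, cycles, start=1):
    """Idealized copy number after a number of PCR cycles in RDA.

    Assumes 100% efficiency: duplexes with an adaptor on both strands
    double every cycle, duplexes with an adaptor on one strand gain one
    copy per cycle, and duplexes without adaptors are not amplified.
    """
    if hybrid_type == "tester:tester":
        return start * 2 ** cycles        # primer binds both strands: exponential
    if hybrid_type == "tester:driver":
        return start * (1 + cycles)       # primer binds one strand only: linear
    if hybrid_type == "driver:driver":
        return start                      # no adaptor, no primer site
    raise ValueError(f"unknown hybrid type: {hybrid_type}")


for kind in ("tester:tester", "tester:driver", "driver:driver"):
    print(kind, copies_after_pcr(kind, cycles=20))
```

Even under these simplified assumptions, a modest number of cycles leaves the tester:tester products in vast excess, which is why they dominate the final display.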

The main advantages of RDA are that it offers a reproducible and sensitive
approach to the analysis of differentially expressed genes. Hubank and Schatz (1994)
reported that they were able to isolate genes that were differentially expressed in
substantially less than 1% of the cells from which the tester is derived. Perhaps the
main drawback is that multiple rounds of ligation, hybridization, amplification and
digestion are required. The procedure is, therefore, lengthier than many other
differential display approaches and provides more opportunity for operator-induced
error to occur. Although the generation of false positives has been noted, this has
been solved to some degree by O'Neill and Sinclair (1997) through the use of HPLC-
purified adaptors. These are free of the truncated adaptors which appear to be a
major source of the false positive bands. A very similar technique to RDA, termed
linker capture subtraction (LCS) was described by Yang and Sytowski (1996).

Suppression PCR Subtractive Hybridization (SSH)

The most recent adaptation of the SH approach to differential expression
analysis was first described by Diatchenko et al. (1996) and Gurskaya et al. (1996).
They reported that a 1000-5000 fold enrichment of rare cDNAs (equivalent to
isolating mRNAs present at only a few copies per cell) can be obtained without the
need for multiple hybridizations/subtractions. Instead of physical or chemical
removal of the common sequences, a PCR-based suppression system is used (see
figure 5).

In SSH, excess driver cDNA is added to two portions of the tester cDNA which
have been ligated with different adaptors. A first round of hybridization serves to
enrich differentially expressed genes and equalize rare and abundant messages.
Equalization occurs since reannealing is more rapid for abundant molecules than for
rarer molecules due to the second order kinetics of hybridization (James and Higgins
1985). The two primary hybridization mixes are then mixed together in the presence
of excess driver and allowed to hybridize further. This step permits the annealing of
single stranded complementary sequences which did not hybridize in the primary
hybridization, and in doing so generates templates for PCR amplification. Although
there are several possible combinations of the single stranded molecules present in
the secondary hybridization mix, only one particular combination (differentially
expressed in the tester cDNA composed of complementary strands having different
adaptors) can amplify exponentially.
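
The equalization effect attributed above to second order kinetics can be illustrated with the standard ideal reassociation relation, in which the single-stranded concentration of a species with starting concentration C0 falls as C0 / (1 + k * C0 * t). The sketch below is a simplification: it treats each species as reannealing only with itself, ignores the driver, and uses hypothetical values for the rate constant and starting concentrations.

```python
def single_stranded_conc(c0, k, t):
    """Single-stranded concentration after time t under ideal second order kinetics."""
    return c0 / (1.0 + k * c0 * t)


# Hypothetical starting concentrations (arbitrary units) and rate constant.
abundant0, rare0 = 100.0, 1.0
k = 1.0

for t in (0.0, 1.0, 10.0):
    abundant = single_stranded_conc(abundant0, k, t)
    rare = single_stranded_conc(rare0, k, t)
    # The ratio of the single-stranded fractions shrinks towards 1 with time.
    print(f"t = {t:4.1f}  abundant/rare single-stranded ratio = {abundant / rare:.2f}")
```

Because the abundant species reanneals faster, the single-stranded fractions of abundant and rare species converge, which is the sense in which the first hybridization equalizes the two.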

Having obtained the final differential display, two options are available if cloning
of cDNAs is desired. One is to transform the whole of the final PCR reaction into
competent cells. Transformed colonies can then be isolated and their inserts
characterized by sequencing, restriction analysis or PCR. Alternatively, the final
PCR products can be resolved on a gel and the individual bands excised, reamplified
and cloned. The first approach is technically simpler and less time consuming.
However, ligation/transformation reactions are known to be biased towards the
cloning of smaller molecules, and so the final population of clones will probably not
contain a representative selection of the larger products. In addition, although
equalization theoretically occurs, observations in this laboratory suggest that this is
by no means perfectly accomplished. Consequently, some gene species are present
in a higher number than others and this will be represented in the final population
of clones. Thus, in order to obtain a substantial proportion of those gene species that
actually demonstrate differential expression in the tester population, the number of
clones that will have to be screened after this step may be substantial. The second
approach is initially more time consuming and technically demanding. However, it
would appear to offer better prospects for cloning larger and low abundance gel
products. In addition, one can incorporate a screening step that differentiates
different products of different sequences but of the same size (HA-staining, see
later). In this way, a good idea of the final number of clones to be isolated and
identified can be achieved.

An alternative (or even complementary) approach is to use the final differential
display reaction to screen a cDNA library to isolate full length clones for further
characterization, or a DNA array (see later) to quickly identify known genes. SSH
has been used in this laboratory to begin characterization of the short-term gene
expression profiles of enzyme-inducers such as phenobarbital (Rockett et al. 1997)
and Wy-14,643 (Rockett et al. unpublished observations). The isolation of
differentially expressed genes in this manner enables the construction of a fingerprint

Figure 5. PCR-Select cDNA subtraction. In the primary hybridization, an excess of driver cDNA is
added to each tester cDNA population. The samples are heat denatured and allowed to hybridize
for between 3 and 8 h. This serves two purposes: (1) to equalize rare and abundant molecules; and
(2) to enrich for differentially expressed sequences: cDNAs that are not differentially expressed
form type c molecules with the driver. In the secondary hybridization, the two primary
hybridizations are mixed together without denaturing. Fresh denatured driver can also be added
at this point to allow further enrichment of differentially expressed sequences. Type e molecules
are formed in this secondary hybridization which are subsequently amplified using two rounds of
PCR. The final products can be visualized on an agarose gel, labelled directly or cloned into a
vector for downstream manipulation. As described by Diatchenko et al. (1996) and Gurskaya
et al. (1996), with permission.

[Figure 6 caption, partly illegible] ... inducers, phenobarbital and Wy-14,643.

of expressed genes which are unique to each compound and time/dose point. Such 
information could be useful in short-term characterization of the toxic potential of 
new compounds by comparing the gene-expression profiles they elicit with those 
produced by known inducers. Figure 6 shows a flow diagram of the method used to 
isolate, verify and clone differentially expressed genes, and figure 7 shows expression 
profiles obtained from a typical SSH experiment. Subsequent sub-cloning of the 
individual bands, sequencing and gene data base interrogation reveals many genes 
which are either up- or down-regulated by phenobarbital in the rat (tables 2 and 3).

One of the advantages in using the SSH approach is that no prior knowledge is
required of which specific genes are up/down-regulated subsequent to xenobiotic

Figure 7. Differential displays obtained from rat liver following 3-day treatment with
phenobarbital. mRNA was extracted from control and treated livers and the differential
displays were generated using PCR-Select cDNA subtraction. [Remainder of caption illegible.]

exposure, and an almost complete complement of genes are obtained. For example,
the peroxisome proliferator and non-genotoxic hepatocarcinogen Wy-14,643 up-
regulates at least 28 genes and down-regulates at least 15 in the rat (a sensitive
species) and produces 48 up- and 37 down-regulated genes in the guinea pig (a
resistant species) (Rockett, Swales, Esdaile and Gibson, unpublished observations).
One of these genes, CD81, was up-regulated in the rat and down-regulated in the
guinea pig following Wy-14,643 treatment. CD81 (alternatively named TAPA-1) is
a [...] protein which is involved in a [...] of cellular
processes [...] activation, proliferation and differentiation (Levy et al. [...]).
[...] of these functions are altered to some extent in the phenomena
[...] it is intriguing, and
probably mechanistically-relevant, that CD81 expression is differentially regulated
in a resistant and susceptible species. However, the down-side of this approach is
that the majority of genes can be sequenced and matched to database sequences, but
the latter are predominantly expressed sequence tags or genes of completely
unknown function, thus partially obscuring a realistic overall assessment of the
critical genes of genuine biological interest. Notwithstanding the lack of complete
functional identification of altered gene expression, such gene profiling studies
essentially provide a 'molecular fingerprint' in response to xenobiotic challenge,
thereby serving as a mechanistically-relevant platform for further detailed
investigations.

Differential Display (DD)

Originally described as 'RNA fingerprinting by arbitrarily primed PCR' (Liang
and Pardee 1992), this method is now more commonly referred to as 'differential

Table 2. Genes up-regulated in rat liver following exposure to phenobarbital.

Band number
(approximate               Highest sequence
size in bp)      Clone     similarity         FASTA-EMBL gene identification

5 (1300)                   93.5%              CYP2B1
7 (1000)                   95.1%              [illegible]
8 (950)                    98.3%              NCI-CGAP-Pr1 H. sapiens (EST)
10 (850)                   95.7%              CYP2B1
11 (800)         Clone 1   94.9%              CYP2B1
                 Clone 2   75.3%              CYP2B2
12 (?)                     93.8%              TRPM-2 mRNA
                                              Sulfated glycoprotein
15 (600)                   9?.9%              Preproalbumin
                                              Serum albumin mRNA
16 (55?)         Clone 1   95.?%              CYP2B1
                 Clone 2   93.6%              Haptoglobulin mRNA partial alpha
21 (350)                   99.3%              18S, 5.8S & 28S rRNA

Bands 1-4, 6, 9, 13, 14 and 17-20 were shown to be false positives by dot blot analysis and, therefore,
were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes do not
represent the complete spectrum of genes which are up-regulated in rat liver by phenobarbital, but
simply represent the genes sequenced and identified to date.



Table 3. Genes down-regulated in rat liver following 3-day exposure to phenobarbital.

Band number
(approximate               Highest sequence
size in bp)      Clone     similarity         FASTA-EMBL gene identification

1 (1300)                   95.3%              3-oxoacyl-CoA thiolase
2 (1200)                   92.3%              Hemopexin mRNA
3 (1000)                   91.7%              Alpha-2u-globulin mRNA
7 (?)            Clone 1   77.2%              M. musculus C1 inhibitor
                 Clone 2   94.5%              Electron transfer flavoprotein
                 Clone 3   91.0%              M. musculus Topoisomerase 1 (Topo I)
8 (650)          Clone 1   36.9%              Soares 2NbMT M. musculus (EST)
                 Clone 2   96.2%              Alpha-2u-globulin ([?] type) mRNA
9 (600)          Clone 1   86.9%              Soares mouse NML M. musculus (EST)
                 Clone 2   82.0%              Soares p3NMF19.5 M. musculus (EST)
10 (550)                   ?3.8%              Soares mouse NML M. musculus (EST)
11 (525)                   95.7%              NCI-CGAP-Pr1 H. sapiens (EST)
12 (373)                   100.0%             Ribosomal protein
13 (23?)         Clone 1   97.2%              Soares mouse embryo NbME13.5 (EST)
                 Clone 2   100.0%             Fibrinogen B-beta-chain
                 Clone 3   100.0%             Apolipoprotein E gene
14 (170)                   96.0%              Soares p3NMF19.5 M. musculus (EST)
15 (140)                   97.3%              Stratagene mouse testis (EST)
Others: (300)              96.7%              R. norvegicus RASP 1 mRNA
        (275)              93.1%              Soares mouse mammary gland (EST)

EST = Expressed sequence tag. Bands 4-6 were shown to be false positives by dot blot analysis and,
therefore, were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes
do not represent the complete spectrum of genes which are down-regulated in rat liver by phenobarbital,
but simply represent the genes sequenced and identified to date.



display' (DD). In this method, all the mRNA species in the control and treated cell
populations are amplified in separate reactions using reverse transcriptase-PCR
(RT-PCR). The products are then run side-by-side on sequencing gels. Those
bands which are present in one display only, or which are much more intense in one
display compared to the other, are differentially expressed and may be recovered for
further characterization. One advantage of this system is the speed with which it can
be carried out: 2 days to obtain a display and as little as a week to make and identify
clones.
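
The side-by-side comparison of the two displays can be sketched as a simple rule over band intensities. This is a toy illustration under stated assumptions: the band positions, intensity values and the three-fold threshold for "much more intense" are hypothetical, and in practice bands are judged on the gel and then confirmed independently.

```python
def classify_bands(control, treated, fold_threshold=3.0):
    """Compare band intensities from control and treated displays.

    control, treated: dicts mapping band position (bp) to intensity in
    arbitrary units; a missing position means no band was seen there.
    """
    calls = {}
    for band in sorted(set(control) | set(treated)):
        c = control.get(band, 0.0)
        t = treated.get(band, 0.0)
        if c == 0.0 and t == 0.0:
            continue                      # nothing visible in either lane
        if c == 0.0:
            calls[band] = "treated only"
        elif t == 0.0:
            calls[band] = "control only"
        elif t / c >= fold_threshold:
            calls[band] = "up in treated"
        elif c / t >= fold_threshold:
            calls[band] = "down in treated"
        else:
            calls[band] = "unchanged"
    return calls


# Hypothetical band intensities for two lanes run side by side.
control = {310: 40.0, 450: 10.0, 520: 25.0}
treated = {310: 42.0, 450: 55.0, 600: 30.0}

calls = classify_bands(control, treated)
for band, call in calls.items():
    print(band, call)
```

Bands called anything other than "unchanged" would be the candidates for excision, cloning and confirmation.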

Two commonly used variations are based on different methods of priming the
reverse transcription step (figure 8). One is to use an oligo dT with a 2-base anchor
at the 3'-end, e.g. 5' (dT11)CA 3' (Liang and Pardee 1992). Alternatively, an
arbitrary primer may be used for 1st strand cDNA synthesis (Welsh et al. 1992).
This variant of RNA fingerprinting has also been called 'RAP' (RNA Arbitrarily
Primed)-PCR. One advantage of this second approach is that PCR products may be
derived from anywhere in the RNA, including open reading frames. In addition, it
can be used for mRNAs that are not polyadenylated, such as many bacterial mRNAs
(Wong and McClelland 1994). In both cases, following reverse transcription and
denaturation, second strand cDNA synthesis is carried out with an arbitrary primer
(arbitrary primers have a single base at each position, as compared to random
primers, which contain a mixture of all four bases at each position). The resulting
PCR, thus, produces a series of products which, depending on the system (primer
length and composition, polymerase and gel system), usually includes 50-100
products per primer set (Band and Sager 1989). When a combination of different
dT-anchors and arbitrary primers are used, almost all mRNA species from a cell can
be amplified. When the cDNA products from two different populations are analysed
side by side on a polyacrylamide gel, differences in expression can be identified and
the appropriate bands recovered for cloning and further analysis.

Although DD is perhaps the most popular approach used today for identifying
differentially expressed genes, it does suffer from several perceived disadvantages:

(1) It may have a strong bias towards high copy number mRNAs (Bertioli et al.
1995), although this has been disputed (Wan et al. 1996) and the isolation of very
low abundance genes may be achieved in certain circumstances (Guimaraes et
al. 1995a).

(2) The cDNAs obtained often only represent the extreme 3' end of the mRNA
(often the 3'-untranslated region), although this may not always be the case
(Guimaraes et al. 1995a). Since the 3' end is often not included in Genbank and
shows variation between organisms, cDNAs identified by DD cannot always be
matched with their genes, even if they have been identified.

(3) The pattern of differential expression seen on the display often cannot be
reproduced on Northern blots, with false positives arising in up to 70% of cases
(Sun et al. 1994). Some adaptations have been shown to reduce false positives,
including the use of two reverse transcriptases (Sung and Denman 1997),
comparison of uninduced and induced cells over a time course (Burn et al. 1994)
and comparison of DDPCR-products from two uninduced and two induced
lines (Sompayrac et al. 1995). The latter authors also reported that the use of
cytoplasmic RNA rather than total RNA reduces false positives arising from
nuclear RNA that is not transported to the cytoplasm.

Further details of the background, strengths and weaknesses of the DD
technique can be obtained from a review by McClelland et al. (1996) and from
articles by Liang et al. (1995) and Wan et al. (1996).

Figure 8. Two approaches to differential display (DD) analysis. 1st strand synthesis can be carried out
either with a polydT11NN primer (where N = G, C or A) or with an arbitrary primer. The use of
different combinations of G, C and A to anchor the first strand polydT primer enables the priming
of the majority of polyadenylated mRNAs. Arbitrary primers may hybridize at none, one or more
places along the length of the mRNA, allowing 1st strand cDNA synthesis to occur at none, one
or more points in the same gene. In both cases, 2nd strand synthesis is carried out with an arbitrary
primer. Since these arbitrary primers for the 2nd strand may also hybridize to the 1st strand cDNA
in a number of different places, several different 2nd strand products may be obtained from one
binding point of the 1st strand primer. Following 2nd strand synthesis, the original set of primers
is used to amplify the second strand products, with the result that numerous gene sequences are
amplified.



Restriction endonuclease-facilitated analysis of gene expression 

Serial Analysis of Gene Expression (SAGE) 

A more recent development in the field of differential display is SAGE analysis 
(Velculescu et al. 1995). This method uses a different approach to those discussed so 
far and is based on two principles. Firstly, in more than 95% of cases, short 
nucleotide sequences ('tags') of only nine or 10 base pairs provide sufficient 
information to identify their gene of origin. Secondly, concatenation (linking 
together in a series) of these tags allows sequencing of multiple cDNAs within a 
single clone. Figure 9 shows a schematic representation of the SAGE process. In this 
procedure, double stranded cDNA from the test cells is synthesized with a 

biotinylated polydT primer. Following digestion with a commonly cutting (4 bp 
recognition sequence) restriction enzyme ('anchoring enzyme'), the 3' ends of the 
cDNA population are captured with streptavidin beads. The captured population is 
split into two and different adaptors ligated to the 5' ends of each group. Incorporated 
into the adaptors is a recognition sequence for a type IIS restriction enzyme, one 
which cuts DNA at a defined distance (< 20 bp) from its recognition sequence. 
Hence, following digestion of each captured cDNA population with the IIS enzyme, 
the adaptors plus a short piece of the captured cDNA are released. The two 
populations are then ligated and the products amplified. The amplified products are 
cleaved with the original anchoring enzyme, religated (concatemers are formed in 
the process) and cloned. The advantage of this system is that hundreds of gene tags 
can be identified by sequencing only a few clones. Furthermore, the number of times 
a given transcript is identified is a quantitative measurement of that gene's 
abundance in the original population, a feature which facilitates identification of 
differentially expressed genes in different cell populations. 
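
The counting principle behind SAGE can be sketched in a few lines of Python. This is an 
illustrative sketch only, not the published analysis software: the anchoring-enzyme site 
`CATG`, the 10 bp tag length and the toy clone sequence are assumptions made for the 
example, and ditags are simplified to single tags. 

```python
from collections import Counter
from typing import List

ANCHOR_SITE = "CATG"   # assumed 4 bp anchoring-enzyme site (hypothetical choice)
TAG_LENGTH = 10        # assumed tag length; the text cites nine or 10 bp


def possible_tags(length: int) -> int:
    """Number of distinct tags of a given length over the alphabet A, C, G, T."""
    return 4 ** length


def extract_tags(concatemer: str, site: str = ANCHOR_SITE,
                 tag_length: int = TAG_LENGTH) -> List[str]:
    """Split a concatemer at each anchoring site and keep full-length tags."""
    pieces = concatemer.upper().split(site)
    # The first piece lies before the first site (e.g. vector sequence): skip it.
    return [p[:tag_length] for p in pieces[1:] if len(p) >= tag_length]


def count_tags(concatemers: List[str]) -> Counter:
    """Tally how often each tag occurs across all sequenced clones."""
    counts: Counter = Counter()
    for clone in concatemers:
        counts.update(extract_tags(clone))
    return counts


if __name__ == "__main__":
    # Toy clone: three tags, two of which come from the same transcript.
    clone = "GG" + ANCHOR_SITE + "A" * 10 + ANCHOR_SITE + "T" * 10 + ANCHOR_SITE + "A" * 10
    for tag, n in count_tags([clone]).most_common():
        print(tag, n)
```

Because each tag is counted once per occurrence, the tallies stand in for relative transcript 
abundance. `possible_tags(9)` also indicates why such short tags can still be discriminating: 
nine bases already allow 262144 distinct sequences. 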

Some disadvantages of SAGE analysis are the technical difficulty of the 
method, the large amount of accurate sequencing required, a bias towards abundant 
mRNAs, the lack of validation in the pharmaco/toxicogenomic setting and the fact 
that it has only been used to examine well known tissue differences to date. 



Gene Expression Fingerprinting (GEF) 

A different capture/restriction digest approach for isolating differentially 
expressed genes has been described by Ivanova and Belyavsky (1995). In this 
method, RNA is converted to cDNA using biotinylated oligo(dT) primers. The 
cDNA population is then digested with a specific endonuclease and captured with 
magnetic streptavidin microbeads to facilitate removal of the unwanted 5' digestion 
products. The use of restricted 3'-ends alone serves to reduce the complexity of the 
cDNA fragment pool and helps to ensure that each RNA species is represented by 
not more than one restriction product. An adaptor is ligated to facilitate subsequent 
amplification of the captured population. PCR is carried out with one adaptor-specific 
and one biotinylated polydT primer. The reamplified population is 
recaptured and the non-biotinylated strands removed by alkaline dissociation. The 
non-biotinylated strand is then resynthesized using a different adaptor-specific 
primer in the presence of a radiolabeled dNTP. The labelled immobilized 3' cDNA 
ends are next sequentially treated with a series of different restriction endonucleases 
and the products from each digestion analysed by PAGE. The result is a fingerprint 
composed of a number of ladders (equal to the number of sequential digests used). 
By comparing test versus control fingerprints, it is possible to identify differentially 
expressed products which can then be isolated from the gel and cloned. The 
advantages of this procedure are that it is very robust and reproducible and the 
authors estimate that 80-93% of cDNA molecules are involved in the final 
fingerprint. The disadvantage is that polyacrylamide gels can rarely resolve more 
than 300-400 bands, which compares poorly to the 1000 or more which are 
estimated to be produced in an average experiment. The use of 2-D gels such as 
those described by Uitterlinden et al. (1989) and Harada et al. (1991) may help to 
overcome this problem. 

A similar method for displaying restriction endonuclease fragments was later 
described by Prashar and Weissman (1996). However, instead of sequential 
digestion of the immobilized 3'-terminal cDNA fragments, these authors simply 
compared the profiles of the control and treated populations without further 
manipulation. 

Figure 9. Serial analysis of gene expression (SAGE) analysis. cDNA is cleaved with an anchoring enzyme 
(AE) and the 3' ends captured using streptavidin beads. The cDNA pool is divided in half and each 
portion ligated to a different linker, each containing a type IIS restriction site (tagging enzyme, 
TE). Restriction with the type IIS enzyme releases the linker plus a short length of cDNA 
(XXXXX and OOOOO indicate nucleotides of different tags). The two pools of tags are then 
ligated and amplified using linker-specific primers. Following PCR, the products are cleaved with 
the AE and the ditags isolated from the linkers using PAGE. The ditags are then ligated (during 
which process, concatenization occurs) and cloned into a vector of choice for sequencing. After 
Velculescu et al. (1995), with permission. 

of all human genes (Hillier et al. 1996). This large number of freely available 
sequences (both sequence information and clones are normally available royalty-free 
from the originators) has enabled the development of a new approach towards 
differential gene expression analysis as described by Vasmatzis et al. (1998). The 
approach is simple in theory: EST databases are first searched for genes that have a 
number of related EST sequences from the target tissue of choice, but none or few 
from non-target tissue libraries. Programmes to assist in the assembly of such sets of 
overlapping data may be developed in-house or obtained privately or from the 
internet. For example, the Institute for Genomic Research (TIGR, found at 
http://www.tigr.org) provides many software tools free of charge to the scientific 
community. Included amongst these is the TIGR assembler (Sutton et al. 1995), a 
tool for the assembly of large sets of overlapping data such as ESTs, bacterial 
artificial chromosomes (BACs), or small genomes. Candidate EST clones 
representing different genes are then analysed using RNA blot methods for size and tissue 
specificity and, if required, used as probes to isolate and identify the full length 
cDNA clone for further characterization. In practice, however, the method is rather 
more involved, requiring bioinformatic and computer analysis coupled with 
confirmatory molecular studies. Vasmatzis et al. (1998) have described several 
problems in this fledgling approach, such as separating highly homologous 
sequences derived from different genes and an overemphasis of specificity for some 
EST sequences. However, since these problems will largely be addressed by the 
development of more suitable computer algorithms and an increased completeness 
of the EST database, it is likely that this approach to identifying differentially 
expressed genes may enjoy more patronage in the future. 
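
The database search step described above (many ESTs from the target tissue, none or few from 
other libraries) can be illustrated with a minimal Python sketch. The record layout, the 
function name and the two thresholds are hypothetical choices for illustration; they are not 
taken from Vasmatzis et al. or from any TIGR tool. 

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def candidate_genes(est_records: Iterable[Tuple[str, str]],
                    target_tissue: str,
                    min_target: int = 3,
                    max_other: int = 1) -> List[str]:
    """Return gene clusters enriched in the target tissue library.

    est_records is an iterable of (gene_cluster_id, tissue_library) pairs.
    min_target and max_other are illustrative thresholds, not published values.
    """
    in_target = defaultdict(int)
    elsewhere = defaultdict(int)
    for gene, tissue in est_records:
        if tissue == target_tissue:
            in_target[gene] += 1
        else:
            elsewhere[gene] += 1
    return sorted(gene for gene, n in in_target.items()
                  if n >= min_target and elsewhere.get(gene, 0) <= max_other)


if __name__ == "__main__":
    records = ([("geneA", "prostate")] * 4 + [("geneA", "liver")]
               + [("geneB", "prostate")] * 5 + [("geneB", "liver")] * 6
               + [("geneC", "prostate")] * 2)
    print(candidate_genes(records, "prostate"))
```

In this toy data only `geneA` passes both filters; candidates selected this way would still 
need the confirmatory RNA blot analysis that the text describes. 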



Problems and potential of differential expression techniques 
The holistic or single cell approach? 

When working with in vivo models of differential expression, one of the first 
issues to consider must be the presence of multiple cell types in any given specimen. 
For example, a liver sample is likely to contain not only hepatocytes, but also 
(potentially) Ito cells, bile ductule cells, endothelial cells, various immune cells (e.g. 
lymphocytes, macrophages and Kupffer cells) and fibroblasts. Other tissues will 
each have their own distinctive cell populations. Also, in the case of neoplastic tissue, 
there are almost always normal, hyperplastic and/or dysplastic cells present in a 
sample. One must, therefore, be aware that genes obtained from a differential 
display experiment performed on an animal tissue model may not necessarily arise 
exclusively from the intended 'target' cells, e.g. hepatocytes/neoplastic cells. If 
appropriate, further analyses using immunohistochemistry, in situ hybridization or 
in situ RT-PCR should be used to confirm which cell types are expressing the 
gene(s) of interest. This problem is probably most acute for those studying the 
differential expression of genes in the development of different cell types, where 
there is a need to examine homologous cell populations. The problem is now being 
addressed at the National Cancer Institute (Bethesda, MD, USA) where new 
microdissection techniques have been employed to assist in their gene analysis programme, 
the Cancer Genome Anatomy Project (CGAP) (for more information see web site: 
http://www.ncbi.nlm.nih.gov/ncicgap/intro.html). There are also separation 
techniques available that utilise cell-specific antigens as a means to isolate target cells, 

species present at less than 1.2% of the total mRNA population, equivalent to an 
intermediate or abundant species. Interestingly, when simple model systems (single 
target only) were used instead of a heterogeneous mRNA population, the same 
primers could detect levels of target mRNA down to 10000 x smaller. These results 
are probably best explained by competition for substrates from the many PCR 
products produced in a DD reaction. 

The numbers of differentially expressed mRNAs reported in the literature using 
various model systems provide further evidence that many differentially expressed 
mRNAs are not recovered. For example, DeRisi et al. (1997) used DNA array 
technology to examine gene expression in yeast following exhaustion of sugar in the 
medium, and found that more than 1700 genes showed a change in expression of at 
least 2-fold. In light of such a finding, it would not be unreasonable to suggest that 
of the 8000-15 000 different mRNA species produced by any given mammalian cell, 
up to 1000 or more may show altered expression following chemical stimulation. 
Whilst this may be an extreme figure, it is known that at least 100 genes are 
activated/upregulated in Jurkat (T-) cells following IL-2 stimulation (Ullman et al. 
1990). In addition, Wan et al. (1996) estimated that interferon-γ-stimulated HeLa 
cells differentially express up to 433 genes (assuming 24000 distinct mRNAs 
expressed by the cells). However, there have been few publications documenting 
anywhere near the recovery of these numbers. For example, in using DD to compare 
normal and regenerating mouse liver, Bauer et al. (1993) found only 70 of 38000 
total bands to be different. Of these, 50% (35 genes) were shown to correspond to 
differentially expressed bands. Chen et al. (1996) reported 10 genes upregulated in 
female rat liver following ethinyl estradiol treatment. McKenzie and Drake (1997) 
identified 14 different gene products whose expression was altered by phorbol 
myristate acetate (PMA, a tumour promoter agent) stimulation of a human 
myelomonocytic cell line. Kilty and Vickers (1997) identified 10 different gene 
products whose expression was upregulated in the peripheral blood leukocytes of 
allergic disease sufferers. Linskens et al. (1995) found 23 genes differentially 
expressed between young and senescent fibroblasts. Techniques other than DD 
have also provided an apparent paucity of differentially expressed genes. Using SH, 
for example, Cao et al. (1997) found 15 genes differentially expressed in colorectal 
cancer compared to normal mucosal epithelium. Fitzpatrick et al. (1995) isolated 17 
genes upregulated in rat liver following treatment with the peroxisome proliferator, 
clofibrate; Philips et al. (1990) isolated 12 cDNA clones which were upregulated in 
highly metastatic mammary adenocarcinoma cell lines compared to poorly 
metastatic ones. Prashar and Weissman (1996) used 3' restriction fragment analysis and 
identified approximately 40 genes showing altered expression within 4 h of 
activation of Jurkat T-cells. Groenink and Leegwater (1996) analysed 27 gene 
fragments isolated using SSH of delayed early response phase of liver regeneration 
and found only 12 to be upregulated. 
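
To make the gap between estimated and recovered numbers concrete, the short Python sketch 
below simply converts three of the figures quoted above into percentages. It is arithmetic 
on the numbers in the text, nothing more; note that the two studies use different 
denominators (distinct mRNAs versus gel bands), so the percentages are indicative only. 

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Figures quoted in the text above.
estimated_wan = percent(433, 24000)    # genes estimated to change in HeLa cells
differing_bauer = percent(70, 38000)   # DD bands found to differ in mouse liver
confirmed_bauer = 0.5 * 70             # 50% of those bands were confirmed

if __name__ == "__main__":
    print(f"Wan et al. estimate:    {estimated_wan:.2f}% of distinct mRNAs")
    print(f"Bauer et al. bands:     {differing_bauer:.2f}% of total bands")
    print(f"Bauer et al. confirmed: {confirmed_bauer:.0f} genes")
```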

In this laboratory, SSH was used to isolate up to 70 candidate genes which appear 
to show altered expression in guinea pig liver following short-term treatment with 
the peroxisome proliferator, Wy-14,643 (Rockett, Swales, Esdaile and Gibson, 
unpublished observations). However, these findings have still to be confirmed by 
analysis of the extracted tissue mRNA for differential expression of these sequences. 
Whilst the latest differential display technologies are purported to include design 
and experimental modifications to overcome this lack of efficiency (in both the total 
number of differentially expressed genes recovered and the percentage that are true 

between two samples. Wan et al. (1996) reported that differences in expression of 
twofold or more are detectable using DD. 



Resolution and visualization of differential expression products 

It seems highly improbable with current technology that a gel system could be 
developed that is able to resolve all gene species showing altered expression in any 
given test system (be it SH- or DD-based). Polyacrylamide gel electrophoresis 
(PAGE) can resolve size differences down to 0.2% (Sambrook et al. 1989) and is 
used as standard in DD experiments. Even so, it is clear that a complex series of gene 
products such as those seen in a DD will contain unresolvable components. Thus, 
what appears to be one band in a gel may in fact turn out to be several. Indeed, it has 
been well documented (Mathieu-Daude et al. 1996, Smith et al. 1997) that a single 
band extracted from a DD often represents a composite of heterogeneous products 
and the same has been found for SSH displays in this laboratory (Rockett et al. 
1997). One possible solution was offered by Mathieu-Daude et al. (1996) who 
extracted and reamplified candidate bands from a DD display and used single strand 
conformation polymorphism (SSCP) analysis to confirm which components 
represented the truly differentially expressed product. 
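
The resolution limit quoted above can be turned into a rough check of which products would 
run as a single band. The Python sketch below is a deliberate simplification: it treats 
resolution as a fixed fraction of fragment size (0.2% for PAGE, as cited in the text) and 
ignores sequence-dependent mobility; the function names are hypothetical. 

```python
from typing import List, Tuple

PAGE_RESOLUTION = 0.002  # 0.2% size difference, the PAGE figure cited in the text


def min_resolvable_difference(size_bp: float, resolution_fraction: float) -> float:
    """Smallest size difference (in bp) expected to be resolved at this size."""
    return size_bp * resolution_fraction


def co_migrating_pairs(sizes_bp: List[int],
                       resolution_fraction: float) -> List[Tuple[int, int]]:
    """Return neighbouring product sizes that are too close to be resolved."""
    ordered = sorted(sizes_bp)
    pairs = []
    for smaller, larger in zip(ordered, ordered[1:]):
        if larger - smaller < min_resolvable_difference(larger, resolution_fraction):
            pairs.append((smaller, larger))
    return pairs


if __name__ == "__main__":
    products = [1000, 1001, 1010]
    print(co_migrating_pairs(products, PAGE_RESOLUTION))
```

Here the 1000 bp and 1001 bp products fall within the limit and would appear as one band, 
which is the situation the text describes for composite bands. 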

Many scientists often try to avoid the use of PAGE where possible because it is 
technically more demanding than agarose gel electrophoresis (AGE). Unfortunately, 
high resolution agarose gels such as Metaphor (FMC, Lichfield, UK) and AquaPor 
HR (National Diagnostics, Hessle, UK), whilst easier to prepare and manipulate 
than PAGE, can only separate DNA sequences which differ in size by around 
1.5-2% (15-20 base pairs for a 1 kb fragment). Thus, SSH, RDA or other such 
products which differ in size by less than this amount are normally not resolvable. 
However, a simple technique does in fact exist for increasing the resolving power of 
AGE: the inclusion of HA-red (10-phenyl neutral red-PEG ligand) or HA-yellow 
(bisbenzamide-PEG ligand) (Hanse Analytik GmbH, Bremen, Germany) in a 
gel separates identical or closely sized products on base content. Specifically, 
HA-red and -yellow selectively bind to GC and AT DNA motifs, respectively 
(Wawer et al. 1995, Hanse Analytik 1997, personal communication). Since both 
HA-stains possess an overall positive charge, they migrate towards the cathode 
when an electric field is applied. This is in direct opposition to DNA, which 
is negatively charged and, therefore, migrates towards the anode. Thus, if two 
DNA clones are identical in size (as perceived on a standard high resolution 
agarose gel), but differ in AT/GC content, inclusion of a HA-dye in the gel 
will effectively retard the migration of one of the sequences compared to the 
other, making it apparently larger and, thus, providing a means of 
differentiating between the two. The use of HA-red has been shown to resolve 
sequences with an AT variation of less than 1% (Wawer et al. 1995), whilst Hanse 
Analytik have reported that HA staining is so sensitive that in one case it was used 
to distinguish two 567 bp sequences which differed by only a single point mutation 
(Hanse Analytik 1996, personal communication). Therefore, if one wishes to check 
whether all the clones produced from a specific band in a differential display 
experiment are derived from the same gene species, a small amount of reamplified 
or digested clone can be run on a standard high resolution gel, and a second aliquot 

Extraction of differentially expressed bands from a gel can be complex since in 
some cases (e.g. DD, GEF), the results are visualized by autoradiographic means, 
such that precise overlay of the developed film on the gel must occur if the correct 
band is to be extracted for further analysis. Clearly, a misjudged extraction can 
account for many man-hours lost. This problem, and that of the use of radioisotopes, 
has been addressed by several groups. For example, Lohmann et al. (1995) 
demonstrated that silver staining can be used directly to visualize DD bands in 
horizontal PAGs. An et al. (1996) avoided the use of radioisotopes by transferring a 
small amount (20-30%) of the DNA from their DD to a nylon membrane, and 
visualizing the bands using chemiluminescent staining before going back to extract 
the remaining DNA from the gel. Chen and Peck (1996) went one step further and 
transferred the entire DD to a nylon membrane. The DNA bands were then 
visualized using a digoxigenin (DIG) system (DIG was attached to the polydT 
primers used in the differential display procedure). Differentially expressed bands 
were cut from the membrane and the DNA eluted by washing with PCR buffer prior 
to reamplification. 

One of the advantages of using techniques such as SSH and RDA is that the final 
display can be run on an agarose gel and the bands visualized with simple ethidium 
bromide staining. Whilst this approach can provide acceptable results, overstaining 
with SYBR Green I or SYBR Gold nucleic acid stains (FMC) effectively enhances 
the intensity and sharpness of the bands. This greatly aids in their precise extraction 
and often reveals some faint products that may otherwise be overlooked. Whilst 
differential displays stained with SYBR Green I are better visualized using short 
wavelength UV (254 nm) rather than medium wavelength (306 nm), the shorter 
wavelength is much more DNA damaging. In practice, it takes only a few seconds 
to damage DNA extracted under 254 nm irradiation, effectively preventing 
reamplification and cloning. The best approach is to overstain with SYBR Green I 
and extract bands under a medium wavelength UV transillumination. 



The possible use of 'micro-fingerprinting' to reduce complexity 

Given the sheer number of gene products and the possible complexity of each 
band, an alternative approach to rapid characterization may be to use an enhanced 
analysis of a small section of a differential display: a 'sub-fingerprint' or 
'micro-fingerprint'. In this case, one could concentrate on those bands which only appear 
in a particular chosen size region. Reducing the fingerprint in this way has at least 
two advantages. One is that it should be possible to use different gel types, 
concentrations and run times tailored exactly to that region. Currently, one might 
run products from 100-3000 bp on the same gel, which leads to compromise in the 
gel system being used and consequently to suboptimal resolution, both in terms of 
size and numbers, and can lead to problems in the accurate excision of individual 
bands. Secondly, it may be possible to enhance resolution by using a 2-D analysis 
using a HA-stain, as described earlier. In summary, if a range of gene product sizes 
is carefully chosen to include certain 'relevant' genes, the 2-D system standardized, 
and appropriate gene analysis used, it may be possible to develop a method for the 
early and rapid identification of compounds which have similar or widely different 
cellular effects. If the prognosis for exposure to one or more other chemicals which 
display a similar profile is already known, then one could perhaps predict similar 
effects for any new compounds which show a similar micro-fingerprint. 

Atlas cDNA Expression Array [illegible] genes, already [illegible], 
grouping together genes involved in different responses, e.g. [illegible] 
damage response etc. 



Screening 

False positives 

The generation of false positives has been discussed at length ([illegible] 
1994, Sompayrac et al. 1995). The reason for false positives varies with the 
technique being used. For instance, in RDA, the use of adaptors which have not 
[illegible]. 
A quick screening of putative differentially expressed clones can be carried out 
[illegible]. Since the SSH method enriches rare sequences, 
it should be possible to confirm the presence of clones representing low abundance 
[illegible] there is still the need to go back to the 
original [illegible] and confirm the altered expression using a more quantitative 
approach. Although this may be achieved using Northern blots, the sensitivity is 
poor by today's high standards and one must rely on PCR methods for accurate and 
sensitive determinations (see below). 



Sequence analysis 

[illegible] procedures produce final products which are 
between 100 and 1000 bp in size. However, this may considerably reduce the size of 
the sequence for analysis of the DNA databases. This in turn leads to a reduced 
confidence in the result: several families of genes have members whose DNA 
sequences are almost identical except in a few key stretches, e.g. the cytochrome 
P450 gene superfamily (Nelson et al. 1996). Thus, does the clone identified as being 
almost identical to gene X1 really come from that gene, or its brother gene X2, or its 
as yet undiscovered sister X3? For example, using SSH, part of a gene was isolated 
which was up-regulated in the liver of rats exposed to Wy-14,643 and was identified 
by a FASTA search as being transferrin (data not shown). However, transferrin is 
known to be downregulated by hypolipidemic peroxisome proliferators such as 
Wy-14,643 (Hertz et al. 1996), and this was confirmed with subsequent RT-PCR 
analysis. This suggests that the gene sequence isolated may belong to a gene which 
is closely related to transferrin, but is regulated by a different mechanism. 

A further problem associated with SH technology is redundancy. In most cases 
before SH is carried out, the cDNA population must first be simplified by restriction 
digestion. This is important for at least two reasons: 

(1) To reduce complexity: long cDNA fragments may form complex networks 
which prevent the formation of appropriate hybrids, especially at the high 
concentrations required for efficient hybridization. 

(2) Cutting the cDNAs into small fragments provides better representation of 
individual genes. This is because genes derived from related but distinct 
members of gene families often have similar coding sequences that may 
cross-hybridize and be eliminated during the subtraction procedure (Ko 1990). 
Furthermore, different fragments from the same cDNA may differ considerably 
in terms of hybridization and amplification and, thus, may not efficiently do one 
or the other (Wang and Brown 1991). Thus, some fragments from differentially 
expressed cDNAs may be eliminated during subtractive hybridization 
procedures. However, other fragments may be enriched and isolated. As a 
consequence of this, some genes will be cut one or more times, giving rise to two 
or more fragments of different sizes. If those same genes are differentially 
expressed, then two or more of the different size fragments may come through 
as separate bands on the final differential display, increasing the observed 
redundancy and increasing the number of redundant sequencing reactions. 
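
The source of this redundancy can be shown with a small in silico digest. The Python sketch 
below is illustrative only: the recognition site `GATC`, the cut position and the toy cDNA 
are assumptions, not the enzyme or sequences used in any of the cited studies. 

```python
from typing import List


def digest(sequence: str, site: str, cut_offset: int = 0) -> List[str]:
    """Cut a sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the cut falls
    (0 cuts immediately before the site). Empty fragments are discarded.
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return [fragment for fragment in fragments if fragment]


if __name__ == "__main__":
    site = "GATC"  # assumed 4 bp recognition site (hypothetical choice)
    cdna = "AAAA" + site + "CCCCCC" + site + "TT"
    pieces = digest(cdna, site)
    print(len(pieces), [len(p) for p in pieces])
```

A single cDNA containing two sites yields three fragments of different sizes, each of which 
could appear as a separate band if the gene is differentially expressed. 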

Sequence comparisons also throw up another important point: at what degree 
of sequence similarity does one accept a result? Is 90% identity between a gene 
derived from your model species and another acceptably close? Is 95% between 
your sequence and one from the same species also acceptable? This problem is 
particularly relevant when the forward and reverse sequence comparisons give 
similar sequences with completely different gene species! An arbitrary decision 
seems to be to allocate genes that are definite (95% and above similarity) and then 
group those between [illegible] and 95% as being related or possible homologues. 
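
The kind of arbitrary cut-off described above can be written down explicitly, which at least 
makes the decision reproducible. In the Python sketch below the 95% default is the upper 
figure quoted in the text; the lower bound is left as a required argument because it is a 
judgment call, and the function names and the ungapped comparison are assumptions made for 
illustration (database search programs report identity over gapped alignments). 

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two equal-length, aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal in length and non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def classify_match(identity: float, related_min: float,
                   definite_min: float = 95.0) -> str:
    """Sort a database hit into 'definite', 'related' or 'unmatched'."""
    if identity >= definite_min:
        return "definite"
    if identity >= related_min:
        return "related"
    return "unmatched"


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    # related_min=80.0 is an arbitrary value chosen for this example only.
    print(identity, classify_match(identity, related_min=80.0))
```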



Quantitative analysis 

At some point, one must give consideration to the quantitative analysis of the 
candidate genes, either as a means of confirming that they are truly differentially 
expressed, or in order to establish just what the differences are. Northern blot 
analysis is a popular approach as it is relatively easy and quick to perform. However, 
the major drawback with Northern blots is that they are often not sensitive enough 
to detect rare sequences. Since the majority of messages expressed in a cell are of low 
abundance (see table 1), this is a major problem. Consequently, RT-PCR may be the 
method of choice for confirming differential expression. Although the procedure is 
somewhat more complex than Northern analysis, requiring synthesis of primers and 
optimization of reaction conditions for each gene species, it is now possible to set up 
high throughput PCR systems using multichannel pipettes, 96-well plates and 
appropriate thermal cycling technology. Whilst quantitative analysis is more 
desirable, being more accurate and without reliance on an internal standard, the 
money and time needed to develop a competitor molecule is often excessive, 
especially when one might be examining tens or even hundreds of gene species. The 
use of semiquantitative analysis is simpler, although still relatively involved. One 
must first of all choose an internal standard that does not change in the test cells 
compared to the controls. Numerous reference genes have been tested, e.g. 
glyceraldehyde-3-phosphate dehydrogenase (GAPDH, Wong et al. 1994), 
dihydrofolate reductase (DHFR, Mohler and Butler 1991), β2-microglobulin 
([illegible]), hypoxanthine phosphoribosyl transferase (HPRT, [illegible]) 
and a number of others (ClonTechniques 1997b). Ideally, an internal 
standard should not change its level of expression in the cell regardless of cell age, 
stage in the cell cycle or through the effects of external stimuli. However, it has been 
shown on numerous occasions that the levels of most housekeeping genes currently 
used by the research community do in fact change under certain conditions 
[illegible]. It is imperative, therefore, that 
preliminary experiments be carried out on a panel of housekeeping genes to establish 
their suitability for use in the model system. 
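
The role of the internal standard in semiquantitative analysis can be sketched as follows. 
This Python example is a minimal illustration under stated assumptions: the signal values 
are arbitrary band intensities, the function names are hypothetical, and the 20% tolerance 
used to call a reference gene 'stable' is an arbitrary choice, not a published criterion. 

```python
def normalized_ratio(target_test: float, target_control: float,
                     reference_test: float, reference_control: float) -> float:
    """Test/control ratio of a target gene after normalizing to a reference gene."""
    return (target_test / reference_test) / (target_control / reference_control)


def reference_is_stable(reference_test: float, reference_control: float,
                        tolerance: float = 0.2) -> bool:
    """True if the reference gene signal changes by no more than the tolerance."""
    return abs(reference_test / reference_control - 1.0) <= tolerance


if __name__ == "__main__":
    # The target appears 3-fold higher, but the reference gene also rose 1.5-fold.
    print(normalized_ratio(300.0, 100.0, 150.0, 100.0))
    print(reference_is_stable(150.0, 100.0))
```

In this toy case the apparent 3-fold induction shrinks to 2-fold after normalization, and 
the stability check flags the reference gene itself as changing, which is the reason the 
text recommends testing a panel of housekeeping genes first. 
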
Interpretation of quantitative data must also be treated with caution. By 
comparing the lists of genes identified by differential expression, one can perhaps 
gain insight into why two different species react in different ways to external stimuli. 
[illegible] effects of a [illegible] 
range of peroxisome proliferators whilst Syrian hamsters and guinea pigs are largely 
resistant (Orton et al. 1984, Rodricks and Turnbull 1987, Lake et al. 1989, 199[illegible], 
Makowska et al. 1992). A simplified approach to resolving the reason(s) why 
[illegible] down-regulated genes in order to identify those which are 
expressed in only one species and, through background knowledge of the effects of 
[illegible] carcinogenesis 
or protection. Of course, the situation is likely to be more complex. Perhaps if 
there were only one key gene protecting guinea pigs from non-genotoxic effects and it was 
upregulated [illegible] times by PPs, the same gene might only be up-regulated five times 
in the rat. However, since both were noted to be upregulated, the importance of the 
gene may be overlooked. Just to complicate matters, a large change in expression 
does not necessarily mean a biologically important change. For example, what is the 
true relevance of gene Y which shows a 50-fold increase after a particular treatment 
[illegible] historically gene Y has often been shown to be up-regulated 40-[illegible]-fold 
[illegible] 50-fold increase would 
[illegible]. However, the literature may show that gene Z has never been 
recorded as having more than doubled in expression, which makes your 5-fold 
[illegible] perhaps even more interesting, if that same 5-fold 
[illegible] following treatment with related 
chemicals. 

Problems in using the differential display approach 

Differential display technology originally held promise of an easily obtainable 
fingerprint of those genes which are up- or down-regulated in test animals/cells in 
a developmental process or following exposure to given stimuli. However, it has 
become clear that the fingerprinting process, whilst still valid, is much too complex 
to be represented by a single technique profile. This is because all differential display 
techniques have common and/or unique technical problems which preclude the 
isolation and identification of all those genes which show changes in expression. 
Furthermore, there are important genetic changes related to disease development 
which differential expression analysis is simply not designed to address. An example 
of this is the presence of small deletions, insertions, or point mutations such as those 
seen in activated oncogenes, tumour suppressor genes and individual 
polymorphisms. Polymorphic variations, small though they usually are, are often 
regarded as being of paramount importance in explaining why some patients 
respond better than others to certain drug treatments (and, in logical extension, why 
some people are less affected by potentially dangerous xenobiotics/carcinogens than 
others). The identification of such point mutations and naturally occurring 
polymorphisms requires the subsequent application of sequencing, SSCP, DGGE 
or TGGE to the gene of interest. Furthermore, differential display is not designed 
to address issues such as alternatively spliced gene species or whether an increased 
abundance of mRNA is a result of increased transcription or increased mRNA 
stability. 



Conclusions 

Perhaps the main advantage of open system differential display techniques is that 
they are not limited by extant theories or researcher bias in revealing genes which are 
differentially expressed, since they are designed to amplify all genes which 
demonstrate altered expression. This means that they are useful for the isolation of 
previously unknown genes which may turn out to be useful biomarkers of a particular 
state or condition. At least one open system (SAGE) is also quantitative, thus 
eliminating the need to return to the original mRNA and carry out Northern/PCR 
analysis to confirm the result. However, the rapid progress of genome mapping 
projects means that over the next 5-10 years or so, the balance of experimental use 
will switch from open to closed differential display systems, particularly DNA 
arrays. Arrays are easier and faster to prepare and use, provide quantitative data, are 
suitable for high throughput analysis and can be tailored to look at specific signaling 
pathways or families of genes. Identification of all the gene sequences in human and 
common laboratory animals, combined with improved DNA array technology, 
means that it will soon no longer be necessary to try to isolate differentially expressed 
genes using the technically more demanding open system approach. Thus, their 
main advantage (that of identifying unknown genes) will be largely eradicated. It is 
likely, therefore, that their sphere of application will be reduced to analysis of the 
less common laboratory species, since it will be some time yet before the genomes of 
such animals as zebrafish, electric eels, gerbils, crayfish and squid, for example, will 
be sequenced. 

Of course, in the end the question will always remain: What is the functional/ 
biological significance of the identified, differentially expressed genes? One 
persistent problem is understanding whether differentially expressed genes are a 
cause or consequence of the altered state. Furthermore, many chemicals, such as 
non-genotoxic carcinogens, are also mitogens and so genes associated with 
replication will also be upregulated but may have little or nothing to do with the 
carcinogenic effect. Whilst differential display technology cannot answer 
these questions, it does provide a 5Bn ^K fl L t I ^ 0pf t0 

«d functionaj srud.es ^HSS^ -ennncano,. regu .„orv 
cellular responses is almost J^,7tew^u7^ of 
of those genes and their cond.non it VZZ^T " ™ tUnct,on 

d.splav can be likened to a .till mUU " d) In " aDstr ^ »««e. differential 

- Cons.der th^t^n 1^0^^ 3 ^ ^ » 
and condition of the troops before the h^L ^ ?' ace ™~ 

deduce how the battle pr /re ^ Id w " ^ ? " ^ " d 

photographs-an impossibl/usk. Intder to 1 e 4^ 

must nnd out the capabilities and motivation of the so^ ' LT 
officers, what the orders were and whether thev S^^™™?™"™" 1 ' 
terras the remams of the battle and consider the effect mu " exam,ne ™< 
conditions exerted. Likew.se. if mechanistic lw e n Z t 

~ and dose re^^ ^^.^^^ 
importance of differential irene ornfili. , J has em PriasiMd the 

-e full impact of S w^h^ ^LlT' " ^ "* 
funcnonal genomics and proteom C ZZTrlt " " C °™'™'° n 
focu,n g and subsequent SDS .Si-tS^S „t ^ "f" 
electrophoresis). Proteom.cs ,s attracting murh -D-maps "»ng capillary 

changes resulting in differential IT "cent attention as many of the 

Protem phosphoryU^ 

proteom,c technologies for mve^iganon ' fUnCU ° nal S™ 0 ™" ° r 

changes ite^Z^^Zfi ^ Ch —' n * ^ *-et,c 

to chemical or bio^ ^ i^ h ^ ^ Md ^ 

provide a 'fingenmnf of * functional data, such profiling will 

«nn should he*^ ° r ^ « the' Ion, 

^«ofchem,cal^ 

*erapeunc benefits of'ut^ 

measurable. Amongst other rh^ molecular cnanses are almost un- 

m<m effiociou, " "»P'-"'_»'kl^n, S£ riup, „d,c« the 
ABSTRACT The recent ability to sequence whole genomes 
allows ready access to all genetic material. The approaches 
outlined here allow automated analysis of sequence for the 
synthesis of optimal primers in an automated multiplex 
oligonucleotide synthesizer (AMOS). The efficiency is such 
that all ORFs for an organism can be amplified by PCR. The 
resulting amplicons can be used directly in the construction of 
DNA arrays or can be cloned for a large variety of functional 
analyses. These tools allow a replacement of single-gene 
analysis with a highly efficient whole-genome analysis. 



The genome sequencing projects have generated and will 
continue to generate enormous amounts of sequence data. The 
genomes of Saccharomyces cerevisiae, Escherichia coli, 
Haemophilus influenzae (1), Mycoplasma genitalium (2), and 
Methanococcus jannaschii (3) have been completely sequenced. 
Other model organisms have had substantial portions of their 
genomes sequenced as well, including the nematode Caeno- 
rhabditis elegans (4) and the small flowering plant Arabidopsis 
thaliana (5). This massive and increasing amount of sequence 
information allows the development of novel experimental 
approaches to identify gene function. 

One standard use of genome sequence data is to attempt to 
identify the functions of predicted open reading frames 
(ORFs) within the genome by comparison to genes of known 
function. Such a comparative analysis of all ORFs to existing 
sequence data is fast, simple, and requires no experimentation 
and is therefore a reasonable first step. While finding sequence 
homologies/motifs is not a substitute for experimentation, 
noting the presence of sequence homology and/or sequence 
motifs can be a useful first step in finding interesting genes, in 
designing experiments and, in some cases, predicting function. 
However, this type of analysis is frequently uninformative. For 
example, over one-half of new ORFs in S. cerevisiae have no 
known function (6). If this is the case in a well studied organism 
such as yeast, the problem will be even worse in organisms that 
are less well studied or less manipulable. A large, experimen- 
tally determined gene function database would make homol- 
ogy/motif searches much more useful. 

Experimental analysis must be performed to thoroughly 
understand the biological function of a gene product. Scaling 
up from classical "cottage industry" one-gene-oriented ap- 
proaches to whole-genome analysis would be very expensive 
and laborious. It is clear that novel strategies are necessary to 
efficiently pursue the next phase of the genome projects — 
whole-genome experimental analysis to explore gene expres- 
sion, gene product function, and other genome functions. 
Model organisms, such as S. cerevisiae, will be extremely 
important in the development of novel whole-genome analysis 
techniques and, subsequently, in improving our understanding 
of other more complex and less manipulable organisms. 

The genome sequence can be systematically used as a tool 
to understand ORFs, gene product function, and other ge- 
nome regions. Toward this end, a directed strategy has been 
developed for exploiting sequence information as a means of 
providing information about biological function (Fig. 1). Ef- 
forts have been directed toward the amplification of each 
predicted ORF or any other region of the genome ranging 
from a few base pairs to several kilobase pairs. There are many 
uses for these amplicons — they can be cloned into standard 
vectors or specialized expression vectors, or can be cloned into 
other specialized vectors such as those used for two-hybrid 
analysis. The amplicons can also be used directly by, for 
example, arraying onto glass for expression analysis, for DNA 
binding assays, or for any direct DNA assay (7). As a pilot 
study, synthetic primers were made on the 96-well automated 
multiplex oligonucleotide synthesizer (AMOS) instrument (8) 
(Fig. 2). These oligonucleotides were used to amplify each 
ORF on yeast chromosome V. The current version of this 
instrument can synthesize three plates of 96 oligonucleotides 
each (25 bases) in an 8-hr day. The amplification of the entire 
set of PCR products was then analyzed by gel electrophoresis 
(Fig. 3). Successful amplification of the proper length product 
on the first attempt was 95%. This project demonstrates that 
one can go directly from sequence information to biological 
analysis in a truly automated, totally directed manner. 
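
The throughput figures quoted above can be turned into a rough planning estimate. The sketch below is illustrative arithmetic only: it uses the stated capacity (three plates of 96 oligonucleotides per 8-hr day) and the 95% first-attempt amplification rate, and it assumes two primers per ORF and that every well of every plate is used. The function names are hypothetical and are not part of the original work.

```python
import math

OLIGOS_PER_PLATE = 96   # from the text: 96-well synthesis plates
PLATES_PER_DAY = 3      # from the text: three plates per 8-hr day
PRIMERS_PER_ORF = 2     # assumption: one forward and one reverse primer per ORF


def synthesis_days(n_orfs: int) -> int:
    """Whole working days needed to synthesize all primers for n_orfs ORFs."""
    oligos_needed = n_orfs * PRIMERS_PER_ORF
    oligos_per_day = OLIGOS_PER_PLATE * PLATES_PER_DAY
    return math.ceil(oligos_needed / oligos_per_day)


def expected_first_pass_failures(n_orfs: int, success_rate: float = 0.95) -> int:
    """Expected number of ORFs that do not amplify correctly on the first attempt."""
    return round(n_orfs * (1.0 - success_rate))


# The 240 chromosome V ORFs of the pilot study, and a genome of about 6,000 ORFs.
for n in (240, 6000):
    print(n, "ORFs:", synthesis_days(n), "days,",
          expected_first_pass_failures(n), "expected first-pass failures")
```

Under these assumptions a single instrument would need on the order of six working weeks to cover a 6,000-ORF genome, which is why batching the failures for resynthesis matters.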

These amplicons can be incorporated directly in arrays or 
the amplicons can be cloned. If the amplicons are to be cloned, 
novel sequences can be incorporated at the 5' end of the 
oligonucleotide to facilitate cloning. One potential problem 
with cloning PCR products is that the cloned amplicons may 
contain sequence alterations that diminish their utility. One 
option would be to resequence each individual amplicon. 
However, this is expensive, inefficient, and time consuming. A 
faster, more cost-effective, and more accurate approach is to 
apply comparative sequencing by denaturing HPLC (9). This 
method is capable of detecting a single base change in a 2-kb 
heteroduplex. Longer amplicons can be analyzed by use of 
appropriate restriction fragments. If any change is detected in 
a clone, an alternate clone of the same region can be analyzed. 
Modifying the system to allow high throughput analysis by 
denaturing HPLC is also relatively simple and straightforward. 

If amplicons are used directly on arrays without cloning, it 
is important to note that, even if single PCR product bands are 
observed on gels, the PCR products will be contaminated with 
various amounts of other sequences. This contamination has 
the potential to affect the results in, for example, expression 
analysis. On the other hand, direct use of the amplicons is 
much less labor intensive and greatly decreases the occurrence 
of mistakes in clone identification, a ubiquitous problem 
associated with large clone set archiving and retrieving. 

Fig. 1. Overview of systematic method for isolating individual 
genes. Sequence information is obtained automatically from sequence 
databases. The data are input into primer selection software specifi- 
cally designed to target ORFs as designated by database annotations. 
The output file containing the primer information is directly read by 
a high-throughput oligonucleotide synthesizer, which makes the oli- 
gonucleotides in 96-well plates (AMOS, automated multiplex oligo- 
nucleotide synthesizer). The forward and reverse primers are synthe- 
sized in the same location on separate plates to facilitate the down- 
stream handling of primers. The amplicons are generated by PCR in 
96-well plates as well. 

Any large-scale effort to capture each ORF within a genome 
must rely on automation if cost is to be minimized while 
efficiency is maximized. Toward that end, primers targeting 
ORFs were designed automatically using simple new scripts 
and existing primer selection software. These script-selected 
primer sequences were directly read by the high-throughput 
synthesizer and the forward and reverse primers were synthe- 
sized in separate plates in corresponding wells to facilitate 
automated pipetting and PCR amplifications. Each of the 
resulting PCR products, generated with minimum labor, con- 
tains a known, unique ORF. 
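
The automated primer step described above can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' scripts or their primer selection software: it takes the first and last 25 bases of each ORF as primers (real software would also weigh melting temperature, secondary structure and uniqueness), and it assigns the forward and reverse primer of each ORF to the same well position on two separate 96-well plates. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ROWS = "ABCDEFGH"        # 8 rows x 12 columns = 96 wells
COLUMNS = 12


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf_seq: str, length: int = 25) -> tuple:
    """Naive primer pair: the ORF start, and the reverse complement of its end."""
    if len(orf_seq) < length:
        raise ValueError("ORF shorter than primer length")
    forward = orf_seq[:length]
    reverse = reverse_complement(orf_seq[-length:])
    return forward, reverse


def well_name(index: int) -> str:
    """Map 0..95 to well labels A1..H12, filling each row left to right."""
    if not 0 <= index < len(ROWS) * COLUMNS:
        raise ValueError("index outside a 96-well plate")
    return f"{ROWS[index // COLUMNS]}{index % COLUMNS + 1}"


def plate_layout(orfs: dict) -> list:
    """Give each ORF one well; its two primers share that position on two plates."""
    layout = []
    for index, (name, seq) in enumerate(orfs.items()):
        forward, reverse = design_primers(seq)
        layout.append((well_name(index), name, forward, reverse))
    return layout
```

Keeping the two primers of a pair in matching wells means a multichannel pipettor or robot can combine the plates without any per-well lookup.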

Large-scale genome analysis projects are dependent on 
newly emerging technologies to make the studies practical and 
economically feasible. For example, the cost of the primers, a 
significant issue in the past, has been reduced dramatically to 
make feasible this and other projects that require tens of 
thousands of oligonucleotides. Other methods of high- 
throughput analysis are also vital to the success of functional 
analysis projects, such as microarraying and oligonucleotide 
chip methods (10-14). 

Changes in attitude are also required. One of the major costs 
of commercial oligonucleotides is extensive quality control 
such that virtually 100% of the supplied oligonucleotides are 
successfully synthesized and work for their intended purpose. 
Fig. 2. Overall approach for using database of a genome to direct 
biological analysis. The synthesis of the 6,000 ORFs (orfs) for each 
gene of S. cerevisiae can be used in many applications utilizing both 
cloning and microarraying technology. 

Considerable cost reduction can be obtained by simply de- 
creasing the expected successful synthesis rate to 95-97%. One 
can then achieve faster and cheaper whole genome coverage by 
simply adding a single quality control at the end of the 
experiment and batching the failures for resynthesis. 

The directed nature of the amplicon approach is of clear 
advantage. The sequence of each ORF is analyzed automati- 
cally, and unique specific primers are made to target each 
ORF. Thus, there is relatively little time or labor involved — for 
example, no random cloning and subsequent screening is 
required because each product is known. In the test system, 
primers for 240 ORFs from chromosome V were systematically 
synthesized, beginning from the left arm and continuing 
through to the right arm. At no point was there any manual 
analysis of sequence information to generate the collection. In 
many ways, now that the sequence is known, there is no need 
for the researcher to examine it. 

These amplicons can be arrayed and expression analysis can 
be done on all arrayed ORFs with a single hybridization (10). 
Those ORFs that display significant differential expression 
patterns under a given selection are easily identified without 
the laborious task of searching for and then sequencing a clone. 
Once scaled up, the procedure provides even greater returns 
on effort, because a single hybridization will ultimately provide 
a "snapshot" of the expression of all genes in the yeast genome. 
Thus, the limiting factor in whole genome analysis will not be 
the analysis process itself, but will instead be the ability of 
researchers to design and carry out experimental selections. 

Current expression and genetic analysis technologies are 
geared toward the analysis of single genes and are ill suited to 
analyze numerous genes under many conditions. Additional 
difficulties with current technologies include: the effort and 
expense required to analyze expression and make mutants, the 
potential duplication of effort if done by different laboratories, 
and the possibility of conflicting results obtained from differ- 
ent laboratories. In contrast, whole genome analysis not only 
is more efficient, it also provides data of much higher quality; 
all genes are assayed and compared in parallel under exactly 
the same conditions. In addition, amplicons have many appli- 
cations beyond gene expression. For example, one recent 
approach is to incorporate a unique DNA sequence tag, 
synthesized as part of each gene specific primer, during 
amplification. The tags or molecular bar codes, when reintro- 
duced into the organism as a gene deletion or as a gene clone, 
can be used much more efficiently than individual mutations 
or clones because pools of tagged mutants or transformants 
can be analyzed in parallel. This parallel analysis is possible 
because the tags are readily and quantitatively amplified even 
in complex mixtures of tags (13). 

Fig. 3. Gel image of amplifications. Using the method described in Fig. 1, amplicons were generated for ORFs of S. cerevisiae chromosome 
V. One plate of 96 amplification reactions is shown. 

These ORF genome arrays and oligonucleotide tagged 
libraries can be used for many applications. Any conventional 
selection applied to a library that gives discrete or multiple 
products can use these technologies for a simple direct read- 
out. These include screens and selections for mutant comple- 
mentation, overexpression suppression (15, 16), second-site 
suppressors, synthetic lethality, drug target overexpression 
(17), two-hybrid screens (18), genome mismatch scanning (19), 
or recombination mapping. 

The genome projects have provided researchers with a vast 
amount of information. These data must be used efficiently 
and systematically to gain a truly comprehensive understand- 
ing of gene function and, more broadly, of the entire genome 
which can then be applied to other organisms. Such global 
approaches are essential if we are to gain an understanding of 
the living cell. This understanding should come from the 
viewpoint of the integration of complex regulatory networks, 
the individual roles and interactions of thousands of functional 
gene products, and the effect of environmental changes on 
both gene regulatory networks and the roles of all gene 
products. The time has come to switch from the analysis of a 
single gene to the analysis of the whole genome. 

Support was provided by National Institutes of Health Grants 
R37H60198 and PO1H6O0205 
INTRODUCTION 

Technological advancements combined with in- 
tensive DNA sequencing efforts have generated an 
enormous database of sequence information over the 
past decade. To date, more than 3 million sequences, 
totaling over 2.2 billion bases [1], are contained 
within the GenBank database, which includes the 
complete sequences of 19 different organisms [2]. The 
first complete sequence of a free-living organism, 
Haemophilus influenzae, was reported in 1995 [3] and 
was followed shortly thereafter by the first complete 
sequence of a eukaryote, Saccharomyces cerevisiae [4]. 
The development of dramatically improved sequenc- 
ing methodologies promises that complete elucida- 
tion of the Homo sapiens DNA sequence is not far 
behind [5]. 

To exploit more fully the wealth of new sequence 
information, it was necessary to develop novel meth- 
ods for the high-throughput or parallel monitoring 
of gene expression. Established methods such as 
northern blotting, RNase protection assays, S1 nu- 
clease analysis, plaque hybridization, and slot blots 
do not provide sufficient throughput to effectively 
utilize the new genomics resources. Newer methods 
such as differential display [6], high-density filter 
hybridization [7,8], serial analysis of gene expression 
[9], and cDNA- and oligonucleotide-based microarray 
"chip" hybridization [10-12] are possible solutions 
to this bottleneck. It is our belief that the microarray 
approach, which allows the monitoring of expres- 
sion levels of thousands of genes simultaneously, is 
a tool of unprecedented power for use in toxicology 
studies. 



Almost without exception, gene expression is al- 
tered during toxicity, as either a direct or indirect 
result of toxicant exposure. The challenge facing 
toxicologists is to define, under a given set of ex- 
perimental conditions, the characteristic and spe- 
cific pattern of gene expression elicited by a given 
toxicant. Microarray technology offers an ideal plat- 
form for this type of analysis and could be the foun- 
dation for a fundamentally new approach to 
toxicology testing. 

MICROARRAY DEVELOPMENT AND APPLICATIONS 

cDNA Microarrays 

In the past several years, numerous systems were 
developed for the construction of large-scale DNA 
arrays. All of these platforms are based on cDNAs 
or oligonucleotides immobilized to a solid sup- 
port. In the cDNA approach, cDNA (or genomic) 
clones of interest are arrayed in a multi-well for- 
mat and amplified by polymerase chain reaction. 
The products of this amplification, which are usu- 
ally 500- to 2000-bp clones from the 3' regions of 
the genes of interest, are then spotted onto solid 
support by using high-speed robotics. By using 
this method, microarrays of up to 10 000 clones 
can be generated by spotting onto a glass substrate 
[13,14]. Sample detection for microarrays on glass 
involves the use of probes labeled with fluores- 
cent or radioactive nucleotides. 

Fluorescent cDNA probes are generated from con- 
trol and test RNA samples in single-round reverse-tran- 
scription reactions in the presence of fluorescently 
tagged dUTP (e.g., Cy3-dUTP and Cy5-dUTP), which 
produces control and test products labeled with dif- 
ferent fluors. The cDNAs generated from these two 
populations, collectively termed the "probe," are then 
mixed and hybridized to the array under a glass cov- 
erslip [10,11,15]. The fluorescent signal is detected 
by using a custom-designed scanning confocal mi- 
croscope equipped with a motorized stage and lasers 
for fluor excitation [10,11,15]. The data are analyzed 
with custom digital image analysis software that de- 
termines for each DNA feature the ratio of fluor 1 to 
fluor 2, corrected for local background [16,17]. The 
strength of this approach lies in the ability to label 
RNAs from control and treated samples with differ- 
ent fluorescent nucleotides, allowing for the simul- 
taneous hybridization and detection of both 
populations on one microarray. This method elimi- 
nates the need to control for hybridization between 
arrays. The research groups of Drs. Patrick Brown and 
Ron Davis at Stanford University spearheaded the 
effort to develop this approach, which has been suc- 
cessfully applied to studies of Arabidopsis thaliana 
RNA [10], yeast genomic DNA [15], tumorigenic ver- 
sus non-tumorigenic human tumor cell lines [11], 
human T-cells [18], yeast RNA [19], and human in- 
flammatory disease-related genes [20]. The most dra- 
matic result of this effort was the first published 
account of gene expression of an entire genome, that 
of the yeast Saccharomyces cerevisiae [21]. 
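
The per-feature measurement described in this section, the ratio of fluor 1 to fluor 2 corrected for local background, can be written down directly. The following is a minimal sketch and not the custom image-analysis software cited above; the log2 scale and the `floor` guard against non-positive corrected signals are assumptions added for illustration.

```python
import math


def log2_ratio(test_signal: float, test_background: float,
               control_signal: float, control_background: float,
               floor: float = 1.0) -> float:
    """Background-corrected log2(test/control) ratio for one array feature.

    `floor` (an assumption, not from the source) keeps the corrected
    intensities positive so that the ratio and its logarithm stay defined.
    """
    test = max(test_signal - test_background, floor)
    control = max(control_signal - control_background, floor)
    return math.log2(test / control)


# A spot four times brighter in the test channel after background subtraction.
print(log2_ratio(900.0, 100.0, 300.0, 100.0))
```

Because both samples are hybridized to the same spot, the ratio is taken within one feature, which is what removes the need to control for hybridization differences between arrays.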

In an alternative approach, large numbers of cDNA 
clones can be spotted onto a membrane support, al- 
beit at a lower density [7,22]. This method is useful 
for expression profiling and large-scale screening and 
mapping of genomic or cDNA clones [7,22-24]. In 
expression profiling on filter membranes, two dif- 
ferent membranes are used simultaneously for con- 
trol and test RNA hybridizations, or a single 
membrane is stripped and reprobed. The signal is 
detected by using radioactive nucleotides and visu- 
alized by phosphorimager analysis or autoradiogra- 
phy. Numerous companies now sell such cDNA 
membranes and software to analyze the image data 
[25-27]. 

Oligonucleotide Microarrays 

Oligonucleotide microarrays are constructed either 
by spotting prefabricated oligos on a glass support 
[13] or by the more elegant method of direct in situ 
oligo synthesis on the glass surface by photolithog- 
raphy [28-30]. The strength of this approach lies in 
its ability to discriminate DNA molecules based on 
single base-pair difference. This allows the applica- 
tion of this method to the fields of medical diagnos- 
tics, pharmacogenetics, and sequencing by hybrid- 
ization as well as gene-expression analysis. 

Fabrication of oligonucleotide chips by photoli- 
thography is theoretically simple but technically 
complex [29,30]. The light from a high-intensity 
mercury lamp is directed through a photolitho- 
graphic mask onto the silica surface, resulting in 
deprotection of the terminal nucleotides in the illu- 
minated regions. The entire chip is then reacted with 
the desired free nucleotide, resulting in selected chain 
elongation. This process requires only 4n cycles 
(where n = oligonucleotide length in bases) to syn- 
thesize a vast number of unique oligos, the total num- 
ber of which is limited only by the complexity of the 
photolithographic mask and the chip size [29,31,32]. 
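
The 4n figure follows from offering each of the four nucleotides once per chain position, with the mask deciding which features are deprotected in each step. The toy simulation below illustrates only that counting argument; it assumes all oligos have the same length and ignores coupling efficiency and mask design, and the function name is hypothetical.

```python
BASES = "ACGT"


def synthesize(targets: list, length: int) -> tuple:
    """Simulate masked in situ synthesis; return (chains, coupling cycles used)."""
    if any(len(t) != length for t in targets):
        raise ValueError("all target oligos must have the stated length")
    chains = [""] * len(targets)
    cycles = 0
    for position in range(length):
        for base in BASES:
            # The mask illuminates (deprotects) only features whose next base is `base`.
            mask = [target[position] == base for target in targets]
            # The whole chip is then exposed to `base`; only deprotected chains extend.
            chains = [chain + base if lit else chain
                      for chain, lit in zip(chains, mask)]
            cycles += 1
    return chains, cycles


chains, cycles = synthesize(["ACG", "TTT", "GCA"], length=3)
print(chains, cycles)
```

The cycle count depends only on oligo length, not on how many distinct features the chip carries, which is the point made in the text.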

Sample preparation involves the generation of 
double-stranded cDNA from cellular poly(A)+ RNA 
followed by antisense RNA synthesis in an in vitro 
transcription reaction with biotinylated or fluor- 
tagged nucleotides. The RNA probe is then frag- 
mented to facilitate hybridization. If the indirect 
visualization method is used, the chips are incubated 
with fluor-linked streptavidin (e.g., phycoerythrin) 
after hybridization [12,33]. The signal is detected with 
a custom confocal scanner [34]. This method has 
been applied successfully to the mapping of genomic 
library clones [35], to de novo sequencing by hybrid- 
ization [28,36], and to evolutionary sequence com- 
parison of the BRCA1 gene [37]. In addition, 
mutations in the cystic fibrosis [38] and BRCA1 [39] 
gene products and polymorphisms in the human im- 
munodeficiency virus-1 clade B protease gene [40] 
have been detected by this method. Oligonucleotide 
chips are also useful for expression monitoring [33] 
as has been demonstrated by the simultaneous evalu- 
ation of gene-expression patterns in nearly all open 
reading frames of the yeast strain S. cerevisiae [12]. 
More recently, oligonucleotide chips have been used 
to help identify single nucleotide polymorphisms in 
the human [41] and yeast [42] genomes. 

THE USE OF MICROARRAYS IN TOXICOLOGY 

Screening for Mechanism of Action 

The field of toxicology uses numerous in vivo 
model systems, including the rat, mouse, and rab- 
bit, to assess potential toxicity and these bioassays 
are the mainstay of toxicology testing. However, in 
the past several decades, a plethora of in vitro tech- 
niques have been developed to measure toxicity, 
many of which measure toxicant-induced DNA dam- 
age. Examples of these assays include the Ames test, 
the Syrian hamster embryo cell transformation as- 
say, micronucleus assays, measurements of sister 
chromatid exchange and unscheduled DNA synthe- 
sis, and many others. Fundamental to all of these 
methods is the fact that toxicity is often preceded 
by, and results in, alterations in gene expression. In 
many cases, these changes in gene expression are a 
far more sensitive, characteristic, and measurable 
endpoint than the toxicity itself. We therefore pro- 
pose that a method based on measurements of the 
genome-wide gene expression pattern of an organ- 
ism after toxicant exposure is fundamentally infor- 
mative and complements the established methods 
described above. 

We are developing a method by which toxicants 
can be identified and their putative mechanisms of 
action determined by using toxicant-induced gene ex- 
pression profiles. In this method, in one or more de- 
fined model systems, dose and time-course parameters 
are established for a series of toxicants within a given 
prototypic class (e.g., polycyclic aromatic hydrocar- 
bons [PAHs]). Cells are then treated with these agents 
at a fixed toxicity level (as measured by cell survival), 
RNA is harvested, and toxicant-induced gene expres- 
sion changes are assessed by hybridization to a cDNA 
microarray chip (Figure 1). We have developed a cus- 
tom DNA chip, called ToxChip v1.0, specifically for 
this purpose and will discuss it in more detail below. 
The changes in gene expression induced by the test 
agents in the model systems are analyzed, and the 
common set of changes unique to that class of toxi- 
cants, termed a toxicant signature, is determined. 

This signature is derived by ranking across all ex- 
periments the gene-expression data based on rela- 
tive fold induction or suppression of genes in treated 
samples versus untreated controls and selecting the 
most consistently different signals across the sample 
set. A different signature may be established for each 
prototypic toxicant class. Once the signatures are de- 
termined, gene-expression profiles induced by un- 
known agents in these same model systems can then 
be compared with the established signatures. A match 
assigns a putative mechanism of action to the test 
compound. Figure 2 illustrates this signature method 
for different types of oxidant stressors, PAHs, and 
peroxisome proliferators. In this example, the un- 
known compound in question had a gene-expres- 
sion profile similar to that of the oxidant stressors in 
the database. We anticipate that this general method 
will also reveal cross talk between different pathways 
induced by a single agent (e.g., reveal that a com- 
pound has both PAH-like and oxidant-like proper- 
ties). In the future, it may be necessary to distinguish 
very subtle differences between compounds within 
a very large sample set (e.g., thousands of highly simi- 
lar structural isomers in a combinatorial chemistry 
library or peptide library). To generate these highly 
refined signatures, standard statistical clustering tech- 
niques or principal-component analysis can be used. 
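
The signature idea can be illustrated with a small sketch. It is a simplification under stated assumptions, not the method used with ToxChip: instead of ranking genes, it keeps the genes whose log2 fold change passes a fixed threshold in the same direction in every experiment of a class, and it scores an unknown profile by the fraction of signature genes it reproduces. The gene labels, values, threshold and function names are all made up for illustration.

```python
def derive_signature(profiles: list, threshold: float = 1.0) -> dict:
    """Genes changed in the same direction in every profile (+1 up, -1 down).

    `profiles` is a non-empty list of dicts mapping gene name to log2 fold change.
    """
    shared = set.intersection(*(set(p) for p in profiles))
    signature = {}
    for gene in shared:
        values = [p[gene] for p in profiles]
        if all(v >= threshold for v in values):
            signature[gene] = 1
        elif all(v <= -threshold for v in values):
            signature[gene] = -1
    return signature


def match_score(profile: dict, signature: dict, threshold: float = 1.0) -> float:
    """Fraction of signature genes that the profile changes in the same direction."""
    if not signature:
        return 0.0
    hits = sum(1 for gene, direction in signature.items()
               if profile.get(gene, 0.0) * direction >= threshold)
    return hits / len(signature)


def classify(profile: dict, signatures: dict, threshold: float = 1.0) -> str:
    """Name of the toxicant class whose signature the profile matches best."""
    return max(signatures,
               key=lambda name: match_score(profile, signatures[name], threshold))


# Invented log2 ratios for three placeholder genes under two known classes.
known = {
    "oxidant stressor": [{"A1": 2.1, "A2": 1.5, "B1": 0.1},
                         {"A1": 1.8, "A2": 1.2, "B1": -0.2}],
    "PAH": [{"A1": 0.1, "A2": 0.0, "B1": 2.5},
            {"A1": -0.1, "A2": 0.2, "B1": 3.0}],
}
signatures = {name: derive_signature(p) for name, p in known.items()}
unknown = {"A1": 1.9, "A2": 1.4, "B1": 0.3}
print(classify(unknown, signatures))
```

A profile that scores well against two signatures at once would be the kind of cross talk between pathways mentioned above; for large, closely related compound sets the clustering or principal-component methods named in the text would replace this simple rule.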

Figure 1. Simplified overview of the method for sample 
preparation and hybridization to cDNA microarrays. For illus- 
trative purposes, samples derived from cell culture are depicted, 
although other sample types are amenable to this analysis. 

Figure 2. Schematic representation of the method for iden- 
tification of a toxicant's mechanism of action. In this method, 
gene-expression data derived from exposure of model sys- 
tems to known toxicants are analyzed, and a set of changes 
characteristic to that type of toxicant (termed the toxicant 
signature) is identified. As depicted, oxidant stressors produce 
consistent changes in group A genes (indicated by red and 
green circles), but not group B or C genes (indicated by gray 
circles). The set of gene-expression changes elicited by the 
suspected toxicant is then compared with these characteristic 
patterns, and a putative mechanism of action is assigned to 
the unknown agent. 

For the studies outlined in Figure 2, we developed 
the custom cDNA microarray chip ToxChip v1.0. 
The 2090 human genes that comprise this subarray 
were selected for their well-documented involve- 
ment in basic cellular processes as well as their re- 
sponses to different types of toxic insult. Included 
on this list are DNA replication and repair genes, 
apoptosis genes, and genes responsive to PAHs and 
dioxin-like compounds, peroxisome proliferators, 
estrogenic compounds, and oxidant stress. Some of 
the other categories of genes include transcription 
factors, oncogenes, tumor suppressor genes, cyclins, 
kinases, phosphatases, cell adhesion and motility 
genes, and homeobox genes. Also included in this 
group are 84 housekeeping genes, whose hybridiza- 
tion intensity is averaged and used for signal nor- 
malization of the other genes on the chip. To date, 
very few toxicants have been shown to have appre- 
ciable effects on the expression of these housekeep- 
ing genes. However, this housekeeping list will be 
revised if new data warrant the addition or deletion 
of a particular gene. Table 1 contains a general de- 
scription of some of the different classes of genes 
that comprise ToxChip v1.0. 
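
The normalization step mentioned above, averaging the hybridization intensity of the housekeeping genes and using it to normalize the other genes, might look like the sketch below. It assumes normalization means dividing each intensity by the mean housekeeping intensity on the same chip; the source does not give the exact formula, and the gene labels and function name are placeholders.

```python
def normalize_to_housekeeping(intensities: dict, housekeeping: list) -> dict:
    """Divide every gene's intensity by the mean housekeeping intensity."""
    reference = [intensities[g] for g in housekeeping if g in intensities]
    if not reference:
        raise ValueError("no housekeeping genes found on this chip")
    scale = sum(reference) / len(reference)
    if scale <= 0:
        raise ValueError("mean housekeeping intensity must be positive")
    return {gene: value / scale for gene, value in intensities.items()}


# Placeholder intensities: two housekeeping genes and one gene of interest.
chip = {"hk1": 100.0, "hk2": 300.0, "geneX": 400.0}
print(normalize_to_housekeeping(chip, ["hk1", "hk2"]))
```

Averaging over many housekeeping genes (84 on this chip) limits the influence of any one of them responding to a toxicant, which is the concern the text raises about revising the list.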

When a toxicant signature is determined, the 
genes within this signature are flagged within the 
database. When uncharacterized toxicants are then 
screened, the data can be quickly reformatted so that 
blocks of genes representing the different signatures 
are displayed [11]. This facilitates rapid, visual in- 
terpretation of data. We are also developing Tox- 
Chip v2.0 and chips for other model systems, 
including rat, mouse, Xenopus, and yeast, for use in 
toxicology studies. 

Animal Models in Toxicology Testing 

The toxicology community relies heavily on the 
use of animals as model systems for toxicology test- 
ing. Unfortunately, these assays are inherently ex- 
pensive, require large numbers of animals and take a 
long time to complete and analyze. Therefore, the 
National Institute of Environmental Health Sciences 
(NIEHS), the National Toxicology Program, and the 
toxicology community at large are committed to re- 
ducing the number of animals used, by developing 
more efficient and alternative testing methodologies. 
Although substantial progress has been made in the 
development of alternative methods, bioassays are 
still used for testing endpoints such as neurotoxic- 
ity, immunotoxicity, reproductive and developmen- 
tal toxicology, and genetic toxicology. The rodent 
cancer bioassay is a particularly expensive and time- 
consuming assay, as it requires almost 4 yr, 1200 
animals, and millions of dollars to execute and ana- 
lyze [43]. In vitro experiments of the type outlined 
in Figure 2 might provide evidence that an unknown
agent is (or is not) responsible for eliciting a given
biological response. This information would help to
select a bioassay more specifically suited to the agent
in question or perhaps suggest that a bioassay is not
necessary, which would dramatically reduce cost,
animal use, and time.

Table 1. ToxChip v1.0: A Human cDNA Microarray
Chip Designed to Detect Responses to Toxic Insult*

Gene category                            No. of genes on chip
Apoptosis                                 72
DNA replication and repair                99
Oxidative stress/redox homeostasis        90
Peroxisome proliferator responsive        22
Dioxin/PAH responsive                     12
Estrogen responsive                       63
Housekeeping                              84
Oncogenes and tumor suppressor genes      76
Cell-cycle control                        51
Transcription factors                    131
Kinases                                  276
Phosphatases                              88
Heat-shock proteins                       23
Receptors                                349
Cytochrome P450s                          30

* This list is intended as a general guide. The gene categories are not
unique, and some genes are listed in multiple categories.

The addition of microarray techniques to stan- 
dard bioassays may dramatically enhance the sen- 
sitivity and interpretability of the bioassay and 
possibly reduce its cost. Gene-expression signatures 
could be determined for various types of tissue-spe- 
cific toxicants, and new compounds could be 
screened for these characteristic signatures, provid- 
ing a rapid and sensitive in vivo test. Also, because 
gene expression is often exquisitely sensitive to low 
doses of a toxicant, the combination of gene-expres- 
sion screening and the bioassay might allow the use 
of lower toxicant doses, which are more relevant to 
human exposure levels, and the use of fewer ani- 
mals. In addition, gene-expression changes are nor- 
mally measured in hours or days, not in the months 
to years required for tumor development. Further- 
more, microarrays might be particularly useful for 
investigating the relationship between acute and 
chronic toxicity and identifying secondary effects 
of a given toxicant by studying the relationship 
between the duration of exposure to a toxicant and 
the gene-expression profile produced. Thus, a bio- 
assay that incorporates gene-expression signatures 
with traditional endpoints might be substantially 
shorter, use more realistic dose regimens, and cost 
substantially less than the current assays do. 

These considerations are also relevant for branches 
of toxicology not related to human health and not 
using rodents as model systems, such as aquatic toxi- 
cology and plant pathology. Bioassays based on the 
fathead minnow, Daphnia, and Arabidopsis could
also be improved by the addition of microarray analy-
sis. The combination of microarrays with traditional
bioassays might also be useful for investigating some 
of the more intractable problems in toxicology re- 
search, such as the effects of complex mixtures and 
the difficulties in cross-species extrapolation. 

Exposure Assessment, Environmental Monitoring, 
and Drug Safety 

The currently used methods for assessment of ex- 
posure to chemical toxicants are based on measure- 
ment of tissue toxin levels or on surrogate markers 
of toxicity, termed biomarkers (e.g., peripheral blood 
levels of hepatic enzymes or DNA adducts). Because 
gene expression is a sensitive endpoint, gene expres- 
sion as measured with microarray technology may 
be useful as a new biomarker to more precisely iden- 
tify hazards and to assess exposure. Similarly, 
microarrays could be used in an environmental- 
monitoring capacity to measure the effect of poten- 
tial contaminants on the gene-expression profiles 
of resident organisms. In an analogous fashion, 
microarrays could be used to measure gene-expres- 
sion endpoints in subjects in clinical trials. The com- 
bination of these gene-expression data and more 
established toxic endpoints in these trials could be 
used to define highly precise surrogates of safety. 

Gene-expression profiles in samples from exposed 
individuals could be compared to the profiles of the 
same individuals before exposure. From this infor- 
mation, the nature of the toxic exposure can be de- 
termined or a relative clinical safety factor estimated. 
In the future it may also be possible to estimate not 
only the nature but the dose of the toxicant for a 
given exposure, based on relative gene-expression 
levels. This general approach may be particularly 
appropriate for occupational-health applications, in 
which unexposed and exposed samples from the 
same individuals may be obtainable. For example, 
a pilot study of gene expression in peripheral-blood 
lymphocytes of Polish coke-oven workers exposed 
to PAHs (and many other compounds) is under con- 
sideration at the NIEHS. An important consideration 
for these types of studies is that gene expression can 
be affected by numerous factors, including diet, 
health, and personal habits. To reduce the effects 
of these confounding factors, it may be necessary 
to compare pools of control samples with pools of 
treated samples. In the future it may be possible to 
compare exposed sample sets to a national database 
of human-expression data, thus eliminating the 
need to provide an unexposed sample from the same 
individual. Efforts to develop such a national gene- 
expression database are currently under way [44,45]. 
However, this national database approach will re- 
quire a better understanding of genome-wide gene 
expression across the highly diverse human popu- 
lation and of the effects of environmental factors 
on this expression.

Alleles, Oligo Arrays, and Toxicogenetics

Gene sequences vary between individuals, and 
this variability can be a causative factor in human 
diseases of environmental origin [46,47]. A new area 
of toxicology, termed toxicogenetics, was recently 
developed to study the relationship between genetic 
variability and toxicant susceptibility. This field is 
not the subject of this discussion, but it is worth- 
while to note that the ability of oligonucleotide ar- 
rays to discriminate DNA molecules based on single 
base-pair differences makes these arrays uniquely 
useful for this type of analysis. Recent reports dem- 
onstrated the feasibility of this approach [41,42]. 
The NIEHS has initiated the Environmental Genome 
Project to identify common sequence polymor- 
phisms in 200 genes thought to be involved in en- 
vironmental diseases [48]. In a pilot study on the 
feasibility of this application to the Environmental 
Genome Project, oligonucleotide arrays will be used 
to resequence 20 candidate genes. This toxicogenetic 
approach promises to dramatically improve our un- 
derstanding of interindividual variability in disease 
susceptibility. 

FUTURE PRIORITIES 

There are many issues that must be addressed be- 
fore the full potential of microarrays in toxicology 
research can be realized. Among these are model sys- 
tem selection, dose selection, and the temporal na- 
ture of gene expression. In other words, in which 
species, at what dose, and at what time do we look 
for toxicant-induced gene expression? If human 
samples are analyzed, how variable is global gene 
expression between individuals, before and after toxi- 
cant exposure? What are the effects of age, diet, and 
other factors on this expression? Experience, in the 
form of large data sets of toxicant exposures, will 
answer these questions. 

One of the most pressing issues for array scientists 
is the construction of a national public database 
(linked to the existing public databases) to serve as a 
repository for gene-expression data. This relational 
database must be made available for public use, and 
researchers must be encouraged to submit their ex- 
pression data so that others may view and query the 
information. Researchers at the National Institutes 
of Health have made laudable progress in develop- 
ing the first generation of such a database [44,45]. In 
addition, improved statistical methods for gene clus- 
tering and pattern recognition are needed to ana- 
lyze the data in such a public database. 

The proliferation of different platforms and meth- 
ods for microarray hybridizations will improve 
sample handling and data collection and analysis and 
reduce costs. However, the variety of microarray 
methods available will create problems of data com- 
patibility between platforms. In addition, the near- 
infinite variety of experimental conditions under
which data will be collected by different laborato-
ries will make large-scale data analysis extremely dif-
ficult. To help circumvent these future problems, a 
set of standards to be included on all platforms 
should be established. These standards would facili- 
tate data entry into the national database and serve 
as reference points for cross-platform and inter-labo- 
ratory data analysis. 

Many issues remain to be resolved, but it is clear 
that new molecular techniques such as microarray 
hybridization will have a dramatic impact on toxicol- 
ogy research. In the future, the information gathered 
from microarray-based hybridization experiments will 
form the basis for an improved method to assess the 
impact of chemicals on human and environmental 
health.

Abstract

Recent progress in genomics and proteomics technologies has created a unique opportunity to significantly impact 
the pharmaceutical drug development processes. The perception that cells and whole organisms express specific 
inducible responses to stimuli such as drug treatment implies that unique expression patterns, molecular fingerprints, 
indicative of a drug's efficacy and potential toxicity are accessible. The integration into state-of-the-art toxicology of
assays allowing one to profile treatment-related changes in gene expression patterns promises new insights into 
mechanisms of drug action and toxicity. The benefits will be improved lead selection, and optimized monitoring of 
drug efficacy and safety in pre-clinical and clinical studies based on biologically relevant tissue and surrogate markers. 
© 2000 Elsevier Science Ireland Ltd. All rights reserved. 

Keywords: Proteomics; Genomics; Toxicology 



1. Introduction 

The majority of drugs act by binding to protein 
targets, most to known proteins representing en- 
zymes, receptors and channels, resulting in effects 
such as enzyme inhibition and impairment of 
signal transduction. The treatment-induced per- 
turbations provoke feedback reactions aiming to 
compensate for the stimulus, which almost always 
are associated with signals to the nucleus, result-
ing in altered gene expression. Such gene expres-
sion regulations account for both the
pharmacological action and the toxicity of a drug
and can be visualized by either global mRNA or 
global protein expression profiling. Hence, for 
each individual drug, a characteristic gene regula- 
tion pattern, its molecular fingerprint, exists 
which bears valuable information on its mode of 
action and its mechanism of toxicity. 

Gene expression is a multistep process that 
results in an active protein (Fig. 1). There exist 
numerous regulation systems that exert control at 
and after the transcription and the translation 
step. Genomics, by definition, encompasses the 
quantitative analysis of transcripts at the mRNA 
level, while the aim of proteomics is to quantify 
gene expression further down-stream, creating a 
snapshot of gene regulation closer to ultimate cell 
function control.

2. Global mRNA profiling

Expression data at the mRNA level can be
produced using a set of different technologies
such as DNA microarrays, reverse transcript
imaging, amplified fragment length polymorphism
(AFLP), serial analysis of gene expression
(SAGE) and others. Currently, DNA microarrays
are very popular and show great potential.
On a typical array, each gene of interest is repre-
sented either by a long DNA fragment (200-2400
bp) typically generated by polymerase chain reac-
tion (PCR) and spotted on a suitable substrate
using robotics (Schena et al., 1995; Shalon et al.,
1996) or by several short oligonucleotides (20-30
bp) synthesized directly onto a solid support using
photolabile nucleotide chemistry (Fodor et al.,
1991; Chee et al., 1996). From control and treated
tissues, total RNA or mRNA is isolated and
reverse transcribed in the presence of radioactive
or fluorescent labeled nucleotides, and the labeled
probes are then hybridized to the arrays. The
intensity of the array signal is measured for each
gene transcript by either autoradiography or laser
scanning confocal microscopy. The ratio between
the signals of control and treated samples reflects
the relative drug-induced change in transcript
abundance.
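
The ratio calculation mentioned above is commonly expressed on a log2 scale, so that induction and repression of the same magnitude are symmetric around zero. The following sketch assumes a treated/control convention, a two-fold cut-off, and a small intensity floor; all three are illustrative choices, and the gene names and values are invented.

```python
from math import log2


def log2_ratios(control, treated, floor=1.0):
    """Per-gene log2(treated / control) for genes measured in both samples.

    Intensities below `floor` are raised to `floor` so that near-zero
    signals cannot produce a division by zero or an extreme ratio.
    """
    ratios = {}
    for gene in control.keys() & treated.keys():
        c = max(control[gene], floor)
        t = max(treated[gene], floor)
        ratios[gene] = log2(t / c)
    return ratios


def changed_genes(ratios, threshold=1.0):
    """Genes whose absolute log2 ratio meets the threshold (1.0 = two-fold)."""
    return sorted(g for g, r in ratios.items() if abs(r) >= threshold)


# Hypothetical signal intensities (arbitrary units).
control = {"geneA": 200.0, "geneB": 400.0, "geneC": 100.0}
treated = {"geneA": 800.0, "geneB": 420.0, "geneC": 25.0}

ratios = log2_ratios(control, treated)
print(changed_genes(ratios))
```

Here geneA is induced four-fold, geneC is repressed four-fold, and geneB is essentially unchanged.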



3. Global protein profiling 

Global quantitative expression analysis at the
protein level is currently restricted to the use of
two-dimensional gel electrophoresis. This tech-
nique combines separation of tissue proteins by
isoelectric focusing in the first dimension and by
sodium dodecyl sulfate slab gel electrophoresis-
based molecular weight separation in the second,
orthogonal dimension (Anderson et al., 1991).
The product is a rectangular pattern of protein
spots that are typically revealed by Coomassie
Blue, silver or fluorescent staining (Fig. 2).
Protein spots are identified by mass spectrometry
following generation of peptide mass fingerprints
(Mann et al., 1993) and sequence tags (Wilkins et
al., 1996). Similar to the mRNA approach, the
optical densities of spots from control and
treated samples are compared to search for
treatment-related changes.

4. Expression data analysis 

Bioinformatics forms a key element required to 
organize, analyze and store expression data from 
either source, the mRNA or the protein level. The 
overall objective, once a mass of high-quality
quantitative expression data has been collected, is
to visualize complex patterns of gene expression
changes, to detect pathways and sets of genes
tightly correlated with treatment efficacy and toxi-
city, and to compare the effects of different sets of
treatment (Anderson et al., 1996). As the drug
effect database is growing, one may detect similar-
ities and differences between the molecular finger-
prints produced by various drugs, information
that may be crucial to make a decision whether to
refocus or extend the therapeutic spectrum of a
drug candidate.

Fig. 2. Computerized representation of a Coomassie Blue stained two-dimensional gel electrophoresis pattern of Fischer F344
liver homogenate.



5. Comparison of global mRNA and protein 
expression profiling 

There are several synergies and overlaps of data 
obtained by mRNA and protein expression analy- 
sis. Low abundant transcripts may not be easily 
quantified at the protein level using standard two- 
dimensional gel electrophoresis analysis and their 
detection may require prefractionation of sam- 
ples. The expression of such genes may be prefer- 
ably quantified at the mRNA level using 
techniques allowing PCR-mediated target amplifi-
cation. Tissue biopsy samples typically yield good
quality of both mRNA and proteins; however, the
quality of mRNA isolated from body fluids is
often poor due to the faster degradation of
mRNA when compared with proteins. RNA sam-
ples from body fluids such as serum or urine are
often not very 'meaningful', and secreted proteins
are likely more reliable surrogate markers for
treatment efficacy and safety. Detection of post-
translational modifications, events often related to
function or nonfunction of a protein, is restricted
to protein expression analysis and rarely can be
predicted by mRNA profiling. Information on
subcellular localization and translocation of
proteins has to be acquired at the level of the
protein in combination with sample prefractiona-
tion procedures. The growing evidence of a poor
correlation between mRNA and protein abun-
dance (Anderson and Seilhamer, 1997) further
suggests that the two approaches, mRNA and
protein profiling, are complementary and should
be applied in parallel.

6. Expression profiling and drug development 

Understanding the mechanisms of action and
toxicity, and being able to monitor treatment
efficacy and safety during trials, are crucial for the
successful development of a drug. Mechanistic
insights are essential for the interpretation of drug
effects and enhance the chances of recognizing
potential species specificities contributing to an
improved risk profile in humans (Richardson et
al., 1993; Steiner et al., 1996b; Aicher et al., 1998).
The value of expression profiling further increases
when links between treatment-induced expression
profiles and specific pharmacological and toxic
endpoints are established (Anderson et al., 1991,
1995, 1996; Steiner et al., 1996a). Changes in gene
expression are known to precede the manifesta-
tion of morphological alterations, giving expres-
sion profiling a great potential for early
compound screening, enabling one to select drug
candidates with wide therapeutic windows
reflected by molecular fingerprints indicative of
high pharmacological potency and low toxicity
(Arce et al., 1998). In later phases of drug devel-
opment, surrogate markers of treatment efficacy
and toxicity can be applied to optimize the moni-
toring of pre-clinical and clinical studies (Doherty
et al., 1998).



7. Perspectives 

The basic methodology of safety evaluation has
changed little during the past decades. Toxicity in
laboratory animals has been evaluated primarily
using hematological, clinical chemistry and
histological parameters as indicators of organ
damage. The rapid progress in genomics and pro-
teomics technologies creates a unique opportunity
to dramatically improve the predictive power of
safety assessment and to accelerate the drug devel-
opment process. Application of gene and protein
expression profiling promises to improve lead se-
lection, resulting in the development of drug can-
didates with higher efficacy and lower toxicity.
The identification of biologically relevant surro-
gate markers correlated with treatment efficacy
and safety bears a great potential to optimize the
monitoring of pre-clinical and clinical trials.

DNA array technology makes it possible to rapidly genotype individuals or quantify the expression
of thousands of genes on a single filter or glass slide, and holds enormous potential in toxicologic
applications. This potential led to a U.S. Environmental Protection Agency-sponsored workshop
titled "Application of Microarrays to Toxicology" on 7-8 January 1999 in Research Triangle Park,
North Carolina. In addition to providing state-of-the-art information on the application of DNA or
gene microarrays, the workshop catalyzed the formation of several collaborations, committees, and
user's groups throughout the Research Triangle Park area and beyond. Potential applications of
microarrays to toxicologic research and risk assessment include genome-wide expression analyses to
identify gene-expression networks and toxicant-specific signatures that can be used to define mode
of action, for exposure assessment, and for environmental monitoring. Arrays may also prove useful
for monitoring genetic variability and its relationship to toxicant susceptibility in human popula-
tions. Key words: DNA arrays, gene arrays, microarrays, toxicology. Environ Health Perspect
107:681-685 (1999). [Online 6 July 1999]

Decoding the genetic blueprint is a dream that
offers manifold returns in terms of understanding
how organisms develop and function in an
often hostile environment. With the rapid
advances in molecular biology over the last 30
years, the dream has come a step closer to reality.
Molecular biologists now have the ability to
elucidate the composition of any genome.
Indeed, almost 20 genomes have already been
sequenced and more than 60 are currently
under way. Foremost among these is the
Human Genome Mapping Project. However,
the genomes of a number of commonly used
laboratory species are also under intensive
investigation, including yeast, Arabidopsis,
maize, rice, zebra fish, mouse, rat, and dog. It
is widely expected that the completion of such
programs will facilitate the development of
many powerful new techniques and approaches
to diagnosing and treating genetically and
environmentally induced diseases which afflict
mankind. However, the vast amount of data
being generated by genome mapping will
require new high-throughput technologies to
investigate the function of the millions of new
genes that are being reported. Among the most
widely heralded of the new functional
genomics technologies are DNA arrays, which
represent perhaps the most anticipated new
molecular biology technique since polymerase
chain reaction (PCR).

Arrays enable the study of literally thousands
of genes in a single experiment. The
potential importance of arrays is enormous and
has been highlighted by the recent publication
of an entire Nature Genetics supplement dedicated
to the technology (1). Despite this huge
surge of interest, DNA arrays are still little used
and largely unproven, as demonstrated by the
high ratio of review and press articles to actual
data papers. Even so, the potential they offer
has driven venture capitalists into a frenzy of
investment and many new companies are
springing up to claim a share of this rapidly
developing market.

The U.S. Environmental Protection
Agency (EPA) is interested in applying DNA
array technology to ongoing toxicologic studies.
To learn more about the current state of
the technology, the Reproductive Toxicology
Division (RTD) of the National Health and
Environmental Effects Research Laboratory
(NHEERL; Research Triangle Park, NC)
hosted a workshop on "Application of
Microarrays to Toxicology" on 7-8 January
1999 in Research Triangle Park, North
Carolina. The workshop was organized by
David Dix, Robert Kavlock, and John Rockett
of the RTD/NHEERL. Twenty-two intramural
and extramural scientists from government,
academia, and industry shared information,
data, and opinions on the current and
future applications for this exciting new
technology. The workshop had more than 150
attendees, including researchers, students, and
administrators from the EPA, the National
Institute of Environmental Health Sciences
(NIEHS), and a number of other establishments
from Research Triangle Park and
beyond. Presentations ranged from the technology
behind array production through the
sharing of actual experimental data and projections
on the future importance and applications
of arrays. The information contained in
the workshop presentations should provide aid
and insight into arrays in general and their
application to toxicology in particular.

Array Elements 

In the context of molecular biology, the word
"array" is normally used to refer to a series of
DNA or protein elements firmly attached in
a regular pattern to some kind of supportive
medium. DNA array is often used interchangeably
with gene array or microarray.
Although not formally defined, microarray is
generally used to describe the higher density
arrays typically printed on glass chips. The
DNA elements that make up DNA arrays
can be oligonucleotides, partial gene
sequences, or full-length cDNAs. Companies
offering premade arrays that contain less
than full-length clones normally use regions
of the genes which are specific to that gene to
prevent false positives arising through cross-
hybridization. Sequence verification of
cDNA clone identity is necessary because of
errors in identifying specific clones from
cDNA libraries and databases. Premade
DNA arrays printed on membranes are currently
or imminently available for human,
mouse, and rat. In most cases they contain
DNA sequences representing several thousand
different sequence clusters or genes as
delineated through the National Center for
Biotechnology Information UniGene Project
(2). Many of these different UniGene clusters
(putative genes) are represented only by
expressed sequence tags (ESTs).

Array Printing 

Arrays are typically printed on one of two
types of support matrix. Nylon membranes
are used by most off-the-shelf array providers
such as Clontech Laboratories, Inc.
(Palo Alto, CA), Genome Systems, Inc. (St.
Louis, MO), and Research Genetics, Inc.
(Huntsville, AL). Microarrays such as those
produced by Affymetrix, Inc. (Santa Clara,
CA), Incyte Pharmaceuticals, Inc. (Palo Alto,
CA), and many do-it-yourself (DIY) arraying
groups use glass wafers or slides. Although
standard microscope slides may be used, they
must be preprepared to facilitate sticking
of the DNA to the glass. Several different
coatings have been successfully used, including
silane and lysine. The coating of slides
can easily be carried out in the laboratory,
but many prefer the convenience of precoated
slides available from suppliers.

Once the support matrix has been prepared,
the DNA elements can be applied by
several methods. Affymetrix, Inc., has developed
a unique photolithographic technology
for attaching oligonucleotides to glass wafers.
More commonly, DNA is applied by either
noncontact or contact printing. Noncontact
printers can use thermal, solenoid, or piezoelectric
technology to spray aliquots of solution
onto the support matrix and may be used to
produce slide or membrane-based arrays.
Cartesian Technologies, Inc. (Irvine, CA) has
developed nQUAD technology for use in its
PixSys printers. The system couples a syringe
pump with the microsolenoid valve, a combination
that provides rapid quantitative dispensing
of nanoliter volumes (down to h.2 nL) over
a variable volume range. A different approach
to noncontact printing uses a solid pin and ring
combination (Genetic MicroSystems, Inc.,
Woburn, MA). This system (Figure 1) allows a
broader range of sample, including cell suspensions
and particulates, because the printing
head cannot be blocked up in the same way as
a spray nozzle. Fluid transfer is controlled in
this system primarily by the pin dimensions
and the force of deposition, although the
nature of the support matrix and the sample
will also affect transfer to some degree.

In contact printing, the pin head is dipped
in the sample and then touched to the support
matrix to deposit a small aliquot. Split pins
were one of the first contact-printing devices
to be reported and are the suggested format
for DIY arrayers, as described by Brown (3).
Split pins are small metal pins with a precise
groove cut vertically in the middle of the pin
tip. In this system, 1-48 split pins are positioned
in the pin head. The split pins work by
simple capillary action, not unlike a fountain
pen: when the pin heads are dipped in the
sample, liquid is drawn into the pin groove. A
small (fixed) volume is then deposited each
time the split pins are gently touched to
the support matrix. Sample (100-500 pL
depending on a variety of parameters) can be
deposited on multiple slides before refilling is
required, and array densities of > 2,500
spots/cm² may be produced. The deposit volume
depends on the split size, sample fluidity,
and the speed of printing. Split pins are
relatively simple to produce and can be made
in-house if a suitable machine shop is available.
Alternatively, they can be obtained
directly from companies such as TeleChem
International, Inc. (Sunnyvale, CA).
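
To give a feel for what the quoted density implies, the arithmetic can be worked through as below. The sketch assumes spots laid out on a regular square grid, which the text does not state, and the 2 cm x 2 cm printable area is a hypothetical figure chosen only for illustration.

```python
from math import sqrt


def spot_pitch_um(spots_per_cm2):
    """Centre-to-centre spacing (micrometres) of a square grid at this density."""
    if spots_per_cm2 <= 0:
        raise ValueError("density must be positive")
    return 10_000.0 / sqrt(spots_per_cm2)  # 1 cm = 10,000 micrometres


def spots_in_area(spots_per_cm2, width_cm, height_cm):
    """Number of spots that fit in a rectangular printable area."""
    return int(spots_per_cm2 * width_cm * height_cm)


pitch = spot_pitch_um(2500)                # density quoted in the text
capacity = spots_in_area(2500, 2.0, 2.0)   # hypothetical 2 cm x 2 cm area
print(pitch, capacity)
```

At 2,500 spots/cm² the spots sit about 200 micrometres apart, so spot diameter and positioning accuracy must both be well below that figure.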

Irrespective of their source, printers
should be run through a preprint sequence
prior to producing the actual experimental
array. The first 100 or so spots of a new run
tend to be somewhat variable. Factors affecting
spot reproducibility include slide treatment
homogeneity, sample differences, and
instrument errors. Other factors that come
into play include clean ejection of the drop
and clogging in nQUAD printing and
mechanical variations and long-term alteration
in print-head surface of solid and split
pins. However, with careful preparation it is
possible to get a coefficient of variation for
spot reproducibility below 10%.
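
The reproducibility figure quoted above is a coefficient of variation, that is, the standard deviation of replicate spot signals expressed as a percentage of their mean. A minimal sketch follows; the replicate intensities are invented, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two replicate spots")
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero")
    return 100.0 * stdev(values) / m


# Hypothetical intensities of five replicate spots of the same DNA element.
replicates = [980.0, 1010.0, 1000.0, 1020.0, 990.0]

cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.2f}%")
```

Computing this value for replicate spots in a preprint run gives a simple check on whether a printer has settled before the experimental arrays are produced.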

One potential printing problem is sample
carryover. Repeated washing, blotting, and
drying (vacuum) of print pins between samples
is normally effective at reducing sample carryover
to negligible amounts. Printing should
also be carried out in a controlled environment.
Humidified chambers are available in
which to place printers. These help prevent
dust contamination and produce a uniform
drying rate, which is important in determining
spot size, quality, and reproducibility.

In summary, although several printing
technologies are available, none are particularly
outstanding and the bottom line
is that they are still in a relatively early stage
of evolution.

Array Hybridization 

The hybridization protocol is, practically
speaking, relatively straightforward and those
with previous experience in blotting should
have little difficulty. Array hybridizations
are, in essence, reverse Southern/Northern
blots: instead of applying a labeled probe to
the target population of DNA/RNA, the
labeled population is applied to the probe(s).
With membrane-based arrays, the control and
treated mRNA populations are normally converted
to cDNA and labeled with isotope (e.g.,
radioactive phosphorus) in the process. These
labeled populations are then hybridized
independently to parallel or serial arrays and
the hybridization signal is detected with a
phosphorimager. A less commonly used
alternative to radioactive probes is
enzymatic detection. The probe may be
biotinylated, haptenylated, or have alkaline
phosphatase/horseradish peroxidase attached.
Hybridization is detected by enzymatic reaction
yielding a color reaction (4). Differences
in hybridization signals can be detected by eye
or, more accurately, with the help of digital
imaging and commercially available software.
The labeling of the test populaoons tor slide- 
based microarravs uses a slightly different 
approach. The probe rypicailv consists of two 
sample of polvA* RNA (usually from a treated 
and a control population) that are converted io 
cDNA; in the process each is labeled with a 
different fluor. The independently labeled 
probes are then mixed together and hybridized 
to a single rrucroarray slide and the resulting 
combined fluorescent signal is scanned. After 

Figure 1. Genetic Microsystems (Woburn, MA) pin ring system for printing
arrays. The pin ring combination consists of a circular open ring
oriented parallel to the sample solution with a vertical pin centered
over the ring. When the ring is dipped into a solution and lifted, it
withdraws an aliquot of sample held by surface tension. To spot the
sample, the pin is driven down through the ring and a portion of the
solution is transferred to the bottom of the pin. The pin continues to
move downward until the pendant drop of solution makes contact with the
underlying surface. The pin is then lifted, and gravity and surface
tension cause deposition of the spot onto the array. Figure from Flowers
et al. (14), with permission from Genetic Microsystems.

normalization, it is possible to determine the
ratio of fluorescent signals from a single
hybridization of a slide-based microarray.
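
The ratio calculation described above can be sketched as follows. This is a minimal illustration only: the article does not say which normalization was used, so global median centering of the log ratios is an assumption here, and the function name and inputs are hypothetical.

```python
import math
from statistics import median


def normalized_log_ratios(treated, control):
    """Return per-spot log2(treated/control) after median centering.

    Assumption (not from the article): most spots do not change, so the
    median log ratio is subtracted to correct for unequal dye
    incorporation or scanner gain. Intensities must be positive.
    """
    if len(treated) != len(control):
        raise ValueError("both channels must cover the same spots")
    raw = [math.log2(t / c) for t, c in zip(treated, control)]
    shift = median(raw)
    return [value - shift for value in raw]
```

After centering, a value of 0 means no apparent change and a value of 1 means a two-fold higher signal in the treated channel.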

cDNA derived from control and treated populations of RNA is most
commonly hybridized to arrays, although subtractive hybridization or
differential display reactions may also be used. Fluorophore- or
radiolabeled nucleotides are directly incorporated into the cDNA in the
process of converting RNA to cDNA. Alternatively, 5' end-labeled primers
may be used for cDNA synthesis. These are labeled with a fluorophore for
direct visualization of the hybridized array. Alternatively, biotin or a
hapten may be attached to the primer, in which case fluor-labeled
streptavidin or antibody must be applied before a signal can be
generated. The most commonly used fluorophores at present are cyanine
(Cy)3 and Cy5 (Amersham Pharmacia Biotech AB, Uppsala, Sweden). However,
the relative expense of these fluorescent conjugates has driven a search
for cheaper alternatives. Fluorescein, rhodamine, and Texas red have all
been used, and companies such as Molecular Probes, Inc. (Eugene, OR) are
developing a series of labeled nucleotides with a wide range of
excitation and emission spectra which may prove to function as well as
the Cy dyes.

Table 1. Advantages and disadvantages of different microarray scanning systems

Analysis of DNA Microarrays

Membrane-based arrays are normally analyzed on film or with a
phosphorimager, whereas chip-based arrays require more specialized
scanning devices. These can be divided into three main groups: the
charge-coupled device camera systems, the nonconfocal laser scanners,
and the confocal laser scanners. The advantages and disadvantages of
each system are listed in Table 1.

Because a typical spot on a microarray can contain > 10 s molecules, it
is clear that a large variation in signal strength may occur. Current
scanners cannot work across this many orders of magnitude (4 or 5 is
more typical). However, the scanning parameters can normally be adjusted
to collect more or less signal, such that two or three scans of the same
array should permit the detection of rare and abundant genes.
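
One way to combine two scans of the same array taken at different sensitivity settings is sketched below. This is a hedged illustration, not a method described in the article: the 16-bit saturation value, the median-based scale estimate, and the function name are all assumptions.

```python
from statistics import median


def merge_scans(low_gain, high_gain, saturation=65535):
    """Merge two scans of one array onto the high-gain intensity scale.

    Assumptions (not from the article): a spot is saturated when its
    high-gain reading reaches `saturation` (16-bit maximum by default),
    and the two scans differ by a single multiplicative factor.
    """
    if len(low_gain) != len(high_gain):
        raise ValueError("both scans must cover the same spots")
    # Estimate the scale factor from spots that are measurable in both scans.
    ratios = [h / l for l, h in zip(low_gain, high_gain)
              if l > 0 and h < saturation]
    if not ratios:
        raise ValueError("no spots are usable in both scans")
    scale = median(ratios)
    # Keep high-gain readings for weak spots; rescale low-gain readings
    # for spots that saturated the high-gain scan.
    return [h if h < saturation else l * scale
            for l, h in zip(low_gain, high_gain)]
```

Rare messages are then read from the sensitive scan and abundant ones from the less sensitive scan, on a common scale.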

When a microarray is scanned, the fluorescent images are captured by
software normally included with the scanner. Several commercial
suppliers provide additional software for quantifying array images, but
the software tools are constantly evolving to meet the developing needs
of researchers, and it is prudent to define one's own needs and clarify
the exact capabilities of the software before its purchase. Issues that
should be considered include the following:

• Can the software locate offset spots? 

• Can it quantitate across irregular hybridization signals?

• Can the arrayed genes be programmed in for
easy identification and location?

• Can the software connect via the Internet to
databases containing further information on
the gene(s) of interest?

One of the key issues raised at the workshop was the sensitivity of
microarray technology. Experiments by General Scanning, Inc. (Watertown,
MA), have shown that by using the Cy dyes and their scanner, signal can
be detected down to levels of < 1 fluor molecule per square micrometer,
which translates to detecting a rare message at approximately one copy
per cell or less.

Array Applications 

Although arrays are an emerging technology certain to undergo
improvement and alteration, they have already been applied usefully to a
number of model systems. Arrays are at their most powerful when they
contain the entire genome of the species they are being used to study.
For this reason, they have strong support among researchers utilizing
yeast and Caenorhabditis elegans (5). The genomes of both of these
species have been sequenced and, in the case of yeast, deposited onto
arrays for examination of gene expression (6,7). With both of these
species, it is relatively easy to perturb individual gene expression.
Indeed, C.



CCD, charge-coupled device.
From Kawasaki (13).

elegans knockouts can be made simply by 
soaking the worms in an antisense solution of 
the gene to be knocked out. 

By a process of systematic gene disruption, it is now possible to
examine the cause and effect relationships between different genes in
these simple organisms. This kind of approach should help elucidate
biochemical pathways and genetic control processes, deconvolute
polygenic interactions, and define the architecture of the cellular
network. A simple case study of how this can be achieved was presented
by Butow [University of Texas Southwestern Medical Center, Dallas, TX
(Figure 2)]. Although it is the phenotypic result of a single gene
knockout that is being examined, the effect of such perturbation will
almost always be polygenic. Polygenic interactions will become
increasingly important as researchers begin to move away from single
gene systems when examining the nature of toxicologic responses to
external stimuli. This is especially important in toxicology because the
phenotype produced by a given environmental insult is never the result
of the action of a single gene; rather, it is a complex interaction of
one or multiple cellular pathways. Phenomena such as quantitative trait
(the continuous variation of phenotype), epistasis (the effect of
alleles of one or more genes on the expression of other genes), and
penetrance (proportion of individuals of a given genotype that display a
particular phenotype) will become increasingly evident and important as
toxicologists push toward the ultimate goal of matching the responses of
individuals to different environmental stimuli.

Analysis of the transcriptome (the expression level of all the genes in
a given cell population) was a use of arrays addressed by several
speakers. Unfortunately, current gene nomenclature is often confusing in
that single genes are allocated multiple names (usually as a result of
independent discovery by different laboratories), and there was a call
for standardization of gene nomenclature. Nevertheless, once a
transcriptome has been assembled it can then be transferred onto arrays
and used to screen any chosen system. The EPA MicroArray Consortium
(EPAMAC) is assembling testes



transcriptomes for human, rat, and mouse. In a slightly different
approach, Nuwaysir et al. (8) describe how the NIEHS assembled what is
effectively a "toxicological transcriptome": a library of human and
mouse genes that have previously been proven or implicated in responses
to toxicologic insults. Clontech Laboratories, Inc. (Palo Alto, CA), has
begun a similar process by developing stress/toxicology filter arrays of
rat, mouse, and human genes. Thus, rather than being tissue or cell
specific, these stress/toxicology arrays can be used across a variety of
model systems to look for alterations in the expression of
toxicologically important genes and define the new field of
toxicogenomics. The potential to identify toxicant families based on
tissue- or cell-specific gene expression could revolutionize drug
testing. These molecular signatures or fingerprints could not only point
to the possible toxicity/carcinogenicity of newly discovered compounds
(Figure 3), but also aid in elucidating their mechanism of action
through identification of gene expression networks. By extension, such
signatures could provide easily identifiable biomarkers to assess the
degree, time, and nature of exposure.

DNA arrays are primarily a tool for examining differential gene
expression in a given model. In this context they are referred to as
closed systems because they lack the ability of other differential
expression technologies, e.g., differential display and subtractive
hybridization, to detect previously unknown genes not present on the
array. This would appear to limit the power of DNA arrays to the
imaginations and preconceptions of the researcher in selecting genes
previously characterized and thought to be involved in the model system.
However, the various genome sequencing projects have created a new
category of sequence, the EST, that has partially mollified this
deficiency. ESTs are cDNAs expressed in a given tissue that, although
they may share some degree of sequence similarity to previously
characterized genes, have not been assigned specific genetic identity.
By incorporating EST clones into an array, it is possible to monitor the
expression of these unknown genes. This can enable the identification of
previously uncharacterized genes that may have biologic



significance in the model system. Filter arrays from Research Genetics
and slide arrays from Incyte Pharmaceuticals both incorporate large
numbers of ESTs from a variety of species.

A further use of microarrays is the identification of single nucleotide
polymorphisms (SNPs). These genomic variations are abundant (they occur
approximately every 1 kb or so) and are the basis of restriction
fragment length polymorphism analysis used in forensic analysis.
Affymetrix, Inc. designed chips that contain multiple repeats of the
same gene sequence. Each position is present with all four possible
bases. After the hybridization of the sample, the degree of
hybridization to the different sequences can be measured and the exact
sequence of the target gene deduced. SNPs are thought to be of vital
importance in drug metabolism and toxicology. For example, single base
differences in the regulatory region or active site of some genes can
account for huge differences in the activity of that gene. Such SNPs are
thought to explain why some people are able to metabolize certain
xenobiotics better than others. Thus, arrays provide a further tool for
the toxicologist investigating the nature of susceptible subpopulations
and toxicologic response.
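
The idea of deducing a sequence from probes that carry all four bases at each position can be sketched as below. This is an illustrative simplification under stated assumptions, not the algorithm used on any commercial chip: the `min_ratio` ambiguity threshold, the data layout, and both function names are hypothetical.

```python
BASES = "ACGT"


def call_sequence(intensities, min_ratio=1.5):
    """Call one base per position from four probe signals.

    `intensities` is a list of dicts mapping each base to the signal of
    the probe carrying that base at the interrogated position. A position
    is reported as "N" when the strongest signal is not at least
    `min_ratio` times the runner-up (an assumed threshold).
    """
    calls = []
    for position in intensities:
        ranked = sorted(BASES, key=lambda base: position[base], reverse=True)
        best, runner_up = position[ranked[0]], position[ranked[1]]
        if best <= 0 or best < min_ratio * runner_up:
            calls.append("N")
        else:
            calls.append(ranked[0])
    return "".join(calls)


def find_variants(called, reference):
    """List (index, reference_base, called_base) where a confident call differs."""
    return [(i, ref, obs)
            for i, (ref, obs) in enumerate(zip(reference, called))
            if obs != "N" and obs != ref]
```

Positions that differ from the reference sequence are candidate SNPs; ambiguous positions are skipped rather than guessed.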

There are still many wrinkles to be ironed out before arrays become a
standard tool for toxicologists. The main issues raised at the workshop
by those with hands-on experience were the following:

• Expense: the cost of purchasing/contracting this technology is still
too great for many individual laboratories.





Figure 2. Potential effects of gene knockout within positively and
negatively regulated gene expression networks. The positive effector is
limiting in wild type for expression of its target gene. (A) A simple,
two-component, linear regulatory network operating on a target gene,
where a first regulator is a positive effector of the target and a
second regulator is either a positive or negative effector of the first.
This network could be deduced by examining the consequence of (B)
deleting the second regulator on the expression of the first regulator
and the target, where expression would be decreased or increased
depending on whether the deleted gene was a positive or negative
regulator. These and other connected components of even greater
complexity could be revealed by genome-wide expression analysis. From
Butow (15).

• Clones: the logistics of identifying, obtaining, and maintaining a set
of nonredundant, noncontaminated, sequence-verified,
species/cell/tissue/field-specific clones.

• Use of inbred strains: where whole-organism models are being used, the
use of inbred strains is important to reduce the potentially confusing
effects of the individual variation typically seen in outbred
populations.

• Probe: the need for relatively large amounts of RNA, which limits the
type of sample (e.g., biopsy) that can be used. Also, different RNA
extraction methods can give different results.

• Specificity: the ability to discriminate accurately between closely
related genes (e.g., the cytochrome P450 family) and splice variants.

• Quantitation: the quantitation of gene expression using gene arrays is
still open to debate. One reason for this is the different incorporation
of the labeling dyes. However, the main difficulty lies in knowing what
to normalize against. One option is to include a large number of
so-called housekeeping genes in the array. However, the expression of
these genes often changes depending on the tissue and the toxicant, so
it is necessary to characterize the expression of these genes in the
model system before utilizing them. This is clearly not a viable option
when screening multiple new compounds. A second option is to include on
the array genes from a nonrelated species (e.g., a plant gene on an
animal array) and to spike the probe with synthetic RNA(s) complementary
to the gene(s).

• Reproducibility: this is sometimes questionable, and a figure of
approximately two or three repeats was used as the minimum number
required to confirm initial findings. Again, however, most people [...]
use of Northern blots or reverse transcriptase PCR to confirm findings.

• Sensitivity: concerns were voiced about the number of target molecules
that must be present in a sample for them to be detected on the array.

• Efficiency: reproducible identification of 1 to 2-fold differences in
expression was reported, although the number of genes that undergo this
level of change and remain undetected is open to debate. It is important
that this level of detection be ultimately achieved because it is
commonly perceived that some important transcription factors and their
regulators respond at such low levels. In most cases, 3- to [...]-fold
was the minimum change that most were happy to accept.

• Bioinformatics: perhaps the greatest concern was how to interpret the
data with the greatest accuracy and efficiency. The biggest headache is
trying to identify networks of gene expression that are common to
different treatments or doses. The amount of data from a single
experiment is huge. It may be that, in the future, several groups
individually equipped with specialized software algorithms for studying
their favorite genes or gene systems will be able to share the same
hybridized chips. Thus, arrays could usher in a new perspective on
collaboration and the sharing of data.
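
The spike-in normalization option and the fold-change cutoffs discussed in the list above can be sketched together as follows. This is a minimal illustration under assumptions that are not in the article: equal amounts of synthetic RNA are spiked into both samples, a single scale factor corrects the treated channel, and the default cutoff of 3-fold simply takes the lower bound quoted above. All names are hypothetical.

```python
from statistics import median


def spike_normalize(treated, control, spike_ids):
    """Rescale the treated channel so the median spike-in ratio becomes 1.

    Assumes equal amounts of each synthetic RNA were added to both
    samples, so any spike-in ratio other than 1 reflects labeling or
    scanning bias. Intensities are dicts keyed by spot identifier.
    """
    factor = median(treated[s] / control[s] for s in spike_ids)
    return {spot: value / factor for spot, value in treated.items()}


def changed_genes(treated, control, spike_ids, min_fold=3.0):
    """Return {gene: fold change} for genes changed at least min_fold either way."""
    scaled = spike_normalize(treated, control, spike_ids)
    hits = {}
    for gene, value in scaled.items():
        if gene in spike_ids:
            continue  # control spots are not reported as results
        fold = value / control[gene]
        if fold >= min_fold or fold <= 1.0 / min_fold:
            hits[gene] = fold
    return hits
```

Lowering `min_fold` toward 2 would report the smaller changes that the workshop participants considered important but harder to reproduce.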

EPAMAC 

Perhaps the main reason most scientists are unable to use array
technology is the high cost involved, whether buying off-the-shelf
membranes, using contract printing services, or producing chips
in-house. In view of this, researchers at the RTD/NHEERL initiated the
EPAMAC. This consortium brings together scientists from the EPA and a
number of extramural labs with the aim of developing microarray
capability through the sharing of resources and data. EPAMAC researchers
are primarily interested in the developmental and toxicologic changes
seen in testicular and breast tissue, and a portion of the workshop was
set aside for EPAMAC members to share their ideas on how the
experimental application of microarrays could facilitate their research.
One of the central areas of interest to EPAMAC members is the effect of
xenobiotics on male fertility and reproductive health. Of greatest
concern is the effect of exposure during critical periods of development
and germ cell differentiation (2), and how this may compromise sperm
counts and quality following sexual maturation (10). As well as
spermatogenic tissue, there is also interest in how residual mRNA found
in mature sperm (11) could be used as an indicator of previous
xenobiotic effects (it is easier to obtain a semen sample than a
testicular biopsy). Arrays will be used to examine and compare the
effect of exposure to heat and chemicals in testicular and epididymal
gene expression profiles, with the aim of establishing
relationships/associations between changes in developmental landmarks
and the effects on sperm count and quality. Cluster, pattern, and other
analysis of such data should help identify hidden relationships between
genes that may reveal potential mechanisms of action and uncover roles
for genes with unknown functions.

Summary 

The full impact of DNA arrays may not be seen for several years, but the
interest shown at this regional workshop indicates the high level of
interest that they foster. Apart from educating and advertising the
various technologies in this field, this workshop brought together a
number of researchers from the Research Triangle Park area who are
already using DNA arrays. The interest in sharing ideas and experiences
led to the initiation of a Triangle array user's group.



Array technology is still in its infancy. This means that the hardware
is still improving and there is no current consensus for standard
procedures, quantitation, and interpretation. Consistency in spotting
and scanning arrays is not yet optimized, and thus is one of the most
critical requirements of any experiment. In addition, one of the dark
regions of array technology, strife in the courts over who owns what
portions of it, has further muddled the future and is a potential
barrier toward the development of consensus procedures.

Perhaps the greatest hurdle for the application of arrays is the actual
interpretation of data. No specialists in bioinformatics attended the
workshop, largely because they are rare and because as yet no one seems
clear on the best method of approaching data analysis and
interpretation. Cross-referencing results from multiple experiments
(time, dose, repeats, different animals, different species) to identify
commonly expressed genes is a great challenge. In most cases, we are
still a long way from understanding how the expression of gene X is
related to the expression of gene Y, and ordering gene expression to
delineate causal relationships.

To the ordinary scientist in the typical laboratory, however, the most
immediate problem is a lack of affordable instrumentation. One can
purchase premade membranes at relatively affordable prices. Although
these may be useful in identifying individual genes to pursue in more
detail using other methods, the numbers that would be required for even
a small routine toxicology experiment prohibit this as a truly viable
approach. For the toxicologist, there is a need to carry out multiple
experiments: dose responses, time curves, multiple animals, and repeats.
Glass-based DNA arrays are most attractive in this context because they
can be prepared in large batches from the same DNA source and
accommodate control and treated samples on the same chip. Another
problem with current off-the-shelf arrays is that they often do not
contain one or more of the particular genes a group is interested in.
One alternative is to obtain and/or produce a set of custom clones and
have contract printing of membranes or slides carried out by a company
such as Genomic Solutions, Inc. (Ann Arbor, MI). This approach is less
expensive than [...] one's own entire system, although at some point it
might make economic sense to [...] one's own arrays.

Finally, DNA arrays are currently a team effort. They are a technology
that uses a wide range of skills including engineering, statistics,
molecular biology, chemistry, and bioinformatics. Because most
individuals are skilled in only one or perhaps two of these areas, it
appears that success with arrays may be best expected by teams of
collaborators consisting of individuals having each of these skills.

Those considering array applications may be amused or goaded on by the
following quote from Fortune magazine:

Microprocessors have reshaped our economy, spawned vast fortunes and
changed the way we live. Gene chips could be even bigger.

Although this comment may have been designed to excite the imagination
rather than accurately reflect the truth, it is fair to say that the age
of functional genomics is upon us. DNA arrays look set to be an
important tool in this new age of biotechnology and will likely
contribute answers to some of toxicology's most fundamental questions.

Date: Mon, 3 Jul 2000 08:09:45 -0400

You can see the list of clones that we have on our 12K chip at

mar.ue. .mens . r.ir. . :r: maps rues: cl cr.esrcr. . cfr 
We selected a subset of genes (2000K) that we believed critical to
response and basic cellular processes and added a set of clones and ESTs
to this. We have included a set of control genes (80+) that were
selected by the NHGRI because they did not change across a large set of
array experiments. However, we have found that some of these genes
changed significantly after tox treatments and are in the process of
looking at the variation of each of these 80+ genes across our
experiments.
Our chips are constantly changing and being updated and we hope that our
data will lead us to what the toxchip should really be.
I hope this answers your question.
Cindy Afshari



> From: Diana Hamlet-Cox
> Sent: Monday, June 26, 2000 8:52 PM
> To: afshari@niehs.nih.gov
> Subject: [Fwd: Toxicology Chip]
>
> Dear Dr. Afshari,
>
> Since I have not yet had a response from Bill Grigg, perhaps he was not
> the right person to contact.
>
> Can you help me in this matter? I don't need to know the sequences,
> necessarily, but I would like very much to know what types of sequences
> are being used, e.g., GPCRs (more specific?), ion channels, etc.
>
> Diana Hamlet-Cox
>
> Original Message
> Subject: Toxicology Chip
> Date: Mon, 19 Jun 2000 18:31:48 -0700
> From: Diana Hamlet-Cox <dianahc@incyte.com>
> Organization: Incyte Pharmaceuticals
> To: grigg@niehs.nih.gov
>
> Dear Colleague:
>
> I am doing literature research on the use of expressed genes as
> pharmacotoxicology markers, and found the Press Release dated February
> 29, 2000 regarding the work of the NIEHS in this area. I would like to
> know if there is a resource I can access (or you could provide?) that
> would give me a list of the 12,000 genes that are on your Human ToxChip
> Microarray. In particular, I am interested in the criteria used to
> select sequences for the ToxChip, including any control sequences
> included in the microarray.
>
> Thank you for your assistance in this request.
>
> Diana Hamlet-Cox, Ph.D.
> Incyte Genomics, Inc.
>
> This e-mail message is for the sole use of the intended recipient(s) and
> may contain confidential and privileged information subject to
> attorney-client privilege. Any unauthorized review, use, disclosure or
> distribution is prohibited. If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.
> =========================